Abstract

This paper expounds the relations between continuous symmetries and con- 
served quantities, i.e. Noether's "first theorem", in both the Lagrangian and 
Hamiltonian frameworks for classical mechanics. This illustrates one of mechan- 
ics' grand themes: exploiting a symmetry so as to reduce the number of variables 
needed to treat a problem. 

I emphasise that, for both frameworks, the theorem is underpinned by the 
idea of cyclic coordinates; and that the Hamiltonian theorem is more powerful. 
The Lagrangian theorem's main "ingredient", apart from cyclic coordinates, is 
the rectification of vector fields afforded by the local existence and uniqueness of 
solutions to ordinary differential equations. For the Hamiltonian theorem, the 
main extra ingredients are the antisymmetry of the Poisson bracket, and the fact
that a vector field generates canonical transformations iff it is Hamiltonian. 



1 email: jb56@cus.cam.ac.uk; jeremy.butterfield@all-souls.oxford.ac.uk 

2 It is a pleasure to dedicate this paper to Jeff Bub, who has made such profound contributions to 
the philosophy of quantum theory. Though the paper is about classical, not quantum, mechanics, I 
hope that with his love of geometry, he enjoys symplectic forms as much as inner products! 
1 Introduction



The strategy of simplifying a mechanical problem by exploiting a symmetry so as 
to reduce the number of variables is one of classical mechanics' grand themes. It is 
theoretically deep, practically important, and recurrent in the history of the subject. 
Indeed, it occurs already in 1687, in Newton's solution of the Kepler problem; (or more 
generally, the problem of two bodies exerting equal and opposite forces along the line 
between them). The symmetries are translations and rotations, and the corresponding 
conserved quantities are the linear and angular momenta. 

This paper will expound one central aspect of this large subject. Namely, the re- 
lations between continuous symmetries and conserved quantities — in effect, Noether's 
"first theorem": which I expound in both the Lagrangian and Hamiltonian frame- 
works, though confining myself to finite-dimensional systems. As we shall see, this 
topic is underpinned by the theorems in elementary Lagrangian and Hamiltonian me- 
chanics about cyclic (ignorable) coordinates and their corresponding conserved mo- 
menta. (Again, there is a glorious history: these theorems were of course clear to these 
subjects' founders.) Broadly speaking, my discussion will make increasing use, as it 
proceeds, of the language of modern geometry. It will also emphasise Hamiltonian, 
rather than Lagrangian, mechanics: apart from mention of the Legendre transformation,
the Lagrangian framework drops out wholly after Section 3.4. 3

There are several motivations for studying this topic. As regards physics, many of 
the ideas and results can be generalized to infinite-dimensional classical systems; and 
in either the original or the generalized form, they underpin developments in quantum 
theories. The topic also leads into another important subject, the modern theory of 
symplectic reduction: (for a philosopher's introduction, cf. Butterfield (2006)). As 
regards philosophy, the topic is a central focus for the discussion of symmetry, which is 
both a long-established philosophical field and a currently active one: cf. Brading and 
Castellani (2003). (Some of the current interest relates to symplectic reduction, whose 
philosophical significance has been stressed recently, especially by Belot: Butterfield 
(2006) gives references.) 

The plan of the paper is as follows. In Section 2, I review the elements of the
Lagrangian framework, emphasising the elementary theorem that cyclic coordinates
yield conserved momenta, and introducing the modern geometric language in which
mechanics is often cast. Then I review Noether's theorem in the Lagrangian framework
(Section 3). I emphasise how the theorem depends on two others: the elementary
theorem about cyclic coordinates, and the local existence and uniqueness of solutions
of ordinary differential equations. Then I introduce Hamiltonian mechanics, again
emphasising how cyclic coordinates yield conserved momenta; and approaching canonical
transformations through the symplectic form (Section 4). This leads to Section 5's
discussion of Poisson brackets; and thereby, of the Hamiltonian version of Noether's

3 It is worth noting the point, though I shall not exploit it, that symplectic structure can be seen 
in the classical solution space of the Lagrangian framework; cf. (3) of Section 1^71 
theorem. In particular, we see what it would take to prove that this version is more
powerful than (encompasses) the Lagrangian version. By the end of the Section, it 
only remains to show that a vector field generates a one-parameter family of canonical 
transformations iff it is a Hamiltonian vector field. It turns out that we can show 
this without having to develop much of the theory of canonical transformations. We 
do so in the course of the final Section's account of the geometric structure of Hamiltonian
mechanics, especially the symplectic structure of a cotangent bundle (Section 6).
Finally, we end the paper by mentioning a generalized framework for Hamiltonian
mechanics which is crucial for symplectic reduction. This framework takes the Poisson 
bracket, rather than the symplectic form, as the basic notion; with the result that 
the state-space is, instead of a cotangent bundle, a generalization called a 'Poisson 
manifold'. 

2 Lagrangian mechanics 
2.1 Lagrange's equations 

We consider a mechanical system with n configurational degrees of freedom (for short: 
n freedoms), described by the usual Lagrange's equations. These are n second-order 
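ordinary differential equations:

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q^i}\right) - \frac{\partial L}{\partial q^i} = 0 , \quad i = 1, \dots, n ; \qquad (2.1)$$
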
where the Lagrangian L is the difference of the kinetic and potential energies:
L := K - V. (We use K for the kinetic energy, not the traditional T; for in differential
geometry, we will use T a lot, both for 'tangent space' and 'derivative map'.)

I should emphasise at the outset that several special assumptions are needed in order
to deduce eq. 2.1 from Newton's second law, as applied to the system's component
parts: (assumptions that tend to get forgotten in the geometric formulations that will 
dominate later Sections!) But I will not go into many details about this, since: 

(i) : there is no single set of assumptions of minimum logical strength (nor a single
"best package-deal" combining simplicity and minimum logical strength);

(ii) : full discussions are available in many textbooks (or, from a philosophical view- 
point, in Butterfield 2004a: Section 3). 

I will just indicate a simple and commonly used sufficient set of assumptions. But 
owing to (i) and (ii), the details here will not be cited in later Sections. 

Note first that if the system consists of N point-particles (or bodies small enough to
be treated as point-particles), so that a configuration is fixed by 3N cartesian coordinates,
we may yet have n < 3N. For the system may be subject to constraints and we
will require the $q^i$ to be independently variable. More specifically, let us assume that
any constraints on the system are holonomic; i.e. each is expressible as an equation
$f(r_1, \dots, r_m) = 0$ among the coordinates $r_k$ of the system's component parts; (here the
$r_k$ could be the 3N cartesian coordinates of N point-particles, in which case m := 3N).
A set of c such constraints can in principle be solved, defining an (m - c)-dimensional
hypersurface Q in the m-dimensional space of the $r_k$; so that on the configuration space
Q we can define n := m - c independent coordinates $q^i$, i = 1, ..., n.

Let us also assume that any constraints on the system are: (i) scleronomous, i.e. 
independent of time, so that Q is identified once and for all; (ii) ideal, i.e. the forces 
that maintain the constraints would do no work in any possible displacement consistent 
with the constraints and applied forces (a 'virtual displacement'). Let us also assume 
that the forces applied to the system are monogenic: i.e. the total work $\delta w$ done in
an infinitesimal virtual displacement is integrable; its integral is the work function U.
(The term 'monogenic' is due to Lanczos (1986, p. 30), but followed by others e.g.
Goldstein et al. (2002, p. 34).) And let us assume that the system is conservative: i.e.
the work function U is independent of both the time and the generalized velocities
and depends only on the $q^i$: $U = U(q^1, \dots, q^n)$.

So to sum up: let us assume that the constraints are holonomic, scleronomous and 
ideal, and that the system is monogenic with a velocity-independent work-function. 
Now let us define K to be the kinetic energy; i.e. in cartesian coordinates, with k now
labelling particles, $K := \sum_k \frac{1}{2} m_k \mathbf{v}_k^2$. Let us also define V := -U to be the potential
energy, and set L := K - V. Then the above assumptions imply eq. 2.1. 4

To solve mechanical problems, we need to integrate Lagrange's equations. Recall 
the idea from elementary calculus that n second-order ordinary differential equations 
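have a (locally) unique solution, once we are given 2n arbitrary constants. Broadly
speaking, this idea holds good for Lagrange's equations; and the 2n arbitrary constants
can be given just as one would expect: as the initial configuration and generalized
velocities $q^i(t_0), \dot q^i(t_0)$ at time $t_0$. More precisely: expanding the time derivatives in eq.
2.1, we get (with summation over the repeated index j)

$$\frac{\partial^2 L}{\partial \dot q^j \partial \dot q^i}\,\ddot q^j = -\frac{\partial^2 L}{\partial q^j \partial \dot q^i}\,\dot q^j - \frac{\partial^2 L}{\partial t\, \partial \dot q^i} + \frac{\partial L}{\partial q^i} ; \qquad (2.2)$$

so that the condition for being able to solve these equations to find the accelerations
at some initial time $t_0$, $\ddot q^i(t_0)$, in terms of $q^i(t_0), \dot q^i(t_0)$ is that the Hessian matrix
$\partial^2 L / \partial \dot q^i \partial \dot q^j$ be nonsingular. Writing the determinant as | |, and partial derivatives as subscripts,
the condition is that:

$$\left| \frac{\partial^2 L}{\partial \dot q^j \partial \dot q^i} \right| \equiv \left| L_{\dot q^j \dot q^i} \right| \neq 0 . \qquad (2.3)$$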

This Hessian condition holds in very many mechanical problems; and henceforth, we 
assume it. (If it fails, we enter the territory of constrained dynamics; for which cf. e.g. 
Henneaux and Teitelboim (1992, Chapters 1-5).) It underpins most of what follows: for 
it is needed to define the Legendre transformation, by which we pass from Lagrangian 



4 Though I shall not develop any details, there is of course a rich theory about these and related 
assumptions. One example, chosen with an eye to our later use of geometry, is that assuming
scleronomous constraints, K is readily shown to be a homogeneous quadratic form in the generalized
velocities, i.e. of the form $K = \sum_{i,j} a_{ij} \dot q^i \dot q^j$; and so K defines a metric on the configuration space.



to Hamiltonian mechanics. 
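
As a minimal sketch of the Hessian condition at work, the following Python fragment estimates the matrix of second derivatives of L with respect to the generalized velocities by central differences, and evaluates its determinant. The system (a planar particle in polar coordinates with a quadratic potential), the parameter values and all function names are assumptions chosen purely for illustration; they do not come from the text.

```python
def lagrangian(q, qdot, m=1.0, k=1.0):
    """Assumed example: planar particle in polar coordinates q = (r, phi),
    with the illustrative potential V = k r^2 / 2."""
    r, _phi = q
    rdot, phidot = qdot
    kinetic = 0.5 * m * (rdot**2 + r**2 * phidot**2)
    return kinetic - 0.5 * k * r**2


def velocity_hessian(L, q, qdot, h=1e-3):
    """Central-difference estimate of d^2 L / (d qdot^i d qdot^j)."""
    n = len(qdot)

    def shifted(i, di, j, dj):
        v = list(qdot)
        v[i] += di
        v[j] += dj
        return L(q, v)

    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            H[i][j] = (shifted(i, h, j, h) - shifted(i, h, j, -h)
                       - shifted(i, -h, j, h) + shifted(i, -h, j, -h)) / (4 * h * h)
    return H


def det2(H):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return H[0][0] * H[1][1] - H[0][1] * H[1][0]


# At r = 2 the Hessian is diag(m, m r^2), so the determinant is about 4.
print(det2(velocity_hessian(lagrangian, (2.0, 0.3), (0.5, 0.7))))
```

In this example the determinant is $m^2 r^2$, so the condition holds everywhere except at r = 0, where the polar coordinate system itself breaks down.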

Of course, even with eq. 2.3, it is still in general hard in practice to solve for the
$\ddot q^i(t_0)$: they are buried in the lhs of eq. 2.2. In (5) of Section 2.2.2, this will motivate
the move to Hamiltonian mechanics. 5

Given eq. 2.3, and so the accelerations at the initial time $t_0$, the basic theorem on
the (local) existence and uniqueness of solutions of ordinary differential equations can
be applied. (We will state this theorem in Section 3.4, in connection with Noether's
theorem.) 

By way of indicating the rich theory that can be built from eq. 2.1 and 2.3, I mention
one main aspect: the power of variational formulations. Eq. 2.1 are the Euler-Lagrange
equations for the variational problem $\delta \int L \, dt = 0$; i.e. they are necessary and sufficient
for the action integral $I = \int L \, dt$ to be stationary. But variational principles will play
no further role in this paper; (Butterfield 2004 is a philosophical discussion).

But our main concern, here and throughout this paper, is how symmetries yield 
conserved quantities, and thereby reduce the number of variables that need to be 
considered in solving a problem. In fact, we are already in a position to prove Noether's 
theorem, to the effect that any (continuous) symmetry of the Lagrangian L yields a 
conserved quantity. But we postpone this to Section 3, until we have developed some
more notions, especially geometric ones. 

We begin with the idea of generalized momenta, and the result that the generalized 
momentum of any cyclic coordinate is a constant of the motion: though very simple, 
this result is the basis of Noether's theorem. Elementary examples prompt the definition
of the generalized, or canonical, momentum, $p_i$, conjugate to a coordinate $q^i$ as:
$p_i := \partial L / \partial \dot q^i$; (this was first done by Poisson in 1809). Note that $p_i$ need not have the
dimensions of momentum: it will not if $q^i$ does not have the dimension length. So Lagrange's
equations can be written:

$$\frac{d}{dt} p_i = \frac{\partial L}{\partial q^i} ; \qquad (2.4)$$

We say a coordinate $q^i$ is cyclic if L does not depend on $q^i$. (The term comes from the
example of an angular coordinate of a particle subject to a central force. Another term
is: ignorable.) Then the Lagrange equation for a cyclic coordinate, $q^n$ say, becomes
$\dot p_n = 0$, implying

$$p_n = \text{constant}, \; c_n \text{ say}. \qquad (2.5)$$

So: the generalized momentum conjugate to a cyclic coordinate is a constant of the 
motion. 

It is straightforward to show that this simple result encompasses the elementary 
theorems of the conservation of momentum, angular momentum and energy: this last 
corresponding to time's being a cyclic coordinate. As a simple example, consider the 



5 This is not to say that Hamiltonian mechanics makes all problems "explicitly soluble" : if only! 
For a philosophical discussion of the various meanings of 'explicit solution', cf. Butterfield (2004a: 
Section 2.1). 
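angular momentum of a free particle. The Lagrangian is, in spherical polar coordinates,

$$L = \frac{1}{2} m \left( \dot r^2 + r^2 \dot\theta^2 + r^2 \dot\phi^2 \sin^2\theta \right) , \qquad (2.6)$$

so that $\partial L / \partial \phi = 0$. So the conjugate momentum

$$\frac{\partial L}{\partial \dot\phi} = m r^2 \dot\phi \sin^2\theta , \qquad (2.7)$$

which is the angular momentum about the z-axis, is conserved.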
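
The conservation of the momentum conjugate to the cyclic coordinate $\phi$ can be checked numerically. The sketch below is only an illustration under assumed values: it moves a free particle along a straight line in cartesian coordinates, converts to spherical polars at several times, and evaluates $m r^2 \dot\phi \sin^2\theta$. The mass, initial data and function names are hypothetical.

```python
import math


def spherical_state(pos, vel):
    """Return (r, theta, phi, phidot) for cartesian position and velocity."""
    x, y, z = pos
    vx, vy, _vz = vel
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r)
    phi = math.atan2(y, x)
    phidot = (x * vy - y * vx) / (x * x + y * y)  # time derivative of atan2(y, x)
    return r, theta, phi, phidot


def p_phi(m, pos, vel):
    """Momentum conjugate to phi: m r^2 sin^2(theta) phidot."""
    r, theta, _phi, phidot = spherical_state(pos, vel)
    return m * r**2 * math.sin(theta)**2 * phidot


def free_particle(pos0, vel, t):
    """Position at time t of a particle moving with constant velocity."""
    return tuple(x + v * t for x, v in zip(pos0, vel))


m = 2.0                      # assumed mass
pos0 = (1.0, 0.5, -0.3)      # assumed initial position
vel = (0.2, -0.7, 0.4)       # assumed constant velocity
values = [p_phi(m, free_particle(pos0, vel, t), vel) for t in (0.0, 1.0, 2.5, 10.0)]
print(values)  # the entries agree up to rounding
```

Although r, $\theta$ and $\dot\phi$ all vary along the trajectory, the combination stays fixed at the z-component of the angular momentum, $m(x \dot y - y \dot x)$.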

2.2 Geometrical perspective 
2.2.1 Some restrictions of scope 

I turn to give a brief description of the elements of Lagrangian mechanics in terms of 
modern differential geometry. Here 'brief' indicates that:

(i) : I will assume without explanation various geometric notions, in particular: 
manifold, vector, 1-form (covector), metric, Lie derivative and tangent bundle. 

(ii) : I will disregard issues about degrees of smoothness: all manifolds, scalars, 
vectors etc. will be assumed to be as smooth as needed for the context. 

(iii) : I will also simplify by speaking "globally, not locally". I will speak as if the 
scalars, vector fields etc. are defined on a whole manifold; when in fact all that we can 
claim in application to most systems is a corresponding local statement — because for 
example, differential equations are guaranteed the existence and uniqueness only of a 
local solution. 6 

We begin by assuming that the configuration space (i.e. the constraint surface) Q 
is a manifold. The physical state of the system, taken as a pair of configuration and 
generalized velocities, is represented by a point in the tangent bundle TQ (also known 
as 'velocity phase space'). That is, writing $T_x$ for the tangent space at $x \in Q$, TQ has
points $(x, \tau)$, $x \in Q$, $\tau \in T_x$. We will of course often work with the natural coordinate
systems on TQ induced by coordinate systems q on Q; i.e. with the 2n coordinates
$(q, \dot q) \equiv (q^i, \dot q^i)$.

The main idea of the geometric perspective is that this tangent bundle is the arena 
for Lagrangian mechanics. So various previous notions and results are now expressed 
in terms of the tangent bundle. In particular, the Lagrangian is a scalar function
$L : TQ \to \mathbb{R}$ which "determines everything". And the conservation of the generalized
momentum $p_n$ conjugate to a cyclic coordinate $q^n$, $p_n \equiv p_n(q, \dot q) = c_n$, means that
the motion of the system is confined to a level set $p_n^{-1}(c_n)$: where this level set is a
(2n - 1)-dimensional sub-manifold of TQ.

6 A note for aficionados. Of the three main pillars of elementary differential geometry — the implicit 
function theorem, the local existence and uniqueness of solutions of ordinary differential equations, 
and Frobenius' theorem — this paper will use the first only implicitly (!), and the second explicitly in 
Sections |3 and 0| The third will not be used. 




But I must admit at the outset that working with TQ involves limiting our discussion
to (a) time-independent Lagrangians and (b) time-independent coordinate transformations.

(a) : Recall Section 2.1's assumptions that secured eq. 2.1. Velocity-dependent potentials
and/or rheonomous constraints would prompt one to use what is often called
the 'extended configuration space' $Q \times \mathbb{R}$, and/or the 'extended velocity phase space'
$TQ \times \mathbb{R}$.

(b) : So would time-dependent coordinate transformations. This is a considerable 
limitation from a philosophical viewpoint, since it excludes boosts, which are central to 
the philosophical discussion of spacetime symmetry groups, and especially of relativity 
principles. To give the simplest example: the Lagrangian of a free particle is just its 
kinetic energy, which can be made zero by transforming to the particle's rest frame; 
i.e. it is not invariant under boosts. 



2.2.2 The tangent bundle 

With these limitations admitted, we now describe Lagrangian mechanics on TQ, in 
five extended comments. 

(1): 2n first-order equations; the Hessian again:
The Lagrangian equations of motion are now 2n first-order equations for the functions
$q^i(t), \dot q^i(t)$, falling into two groups:

(a) the n equations eq. 2.2, with the $\ddot q^i$ taken as the time derivatives of $\dot q^i$ with
respect to t; i.e. we envisage using the Hessian condition eq. 2.3 to solve eq. 2.2 for
the $\ddot q^i$, hard though this usually is to do in practice;

(b) the n equations $\dot q^i = \frac{dq^i}{dt}$.

(2): Vector fields and solutions:

(a) : These 2n first-order equations are equivalent to a vector field on TQ: the
'dynamical vector field', or for short the 'dynamics'. I write it as D (to distinguish it
from the generic vector field X, Y, ...).

(b) : In the natural coordinates $(q^i, \dot q^i)$, the vector field D is expressed as

$$D = \dot q^i \frac{\partial}{\partial q^i} + \ddot q^i \frac{\partial}{\partial \dot q^i} ; \qquad (2.8)$$

and the rate of change of any dynamical variable f, taken as a scalar function on TQ,
$f(q, \dot q) \in \mathbb{R}$, is given by

$$\frac{df}{dt} = D(f) . \qquad (2.9)$$

(c) : So the Lagrangian L determines the dynamical vector field D, and so (for
given initial $q, \dot q$) a (locally unique) solution: an integral curve of D, 2n functions of time
$q(t), \dot q(t)$ (with the first n functions determining the latter). This separation of
solutions/trajectories within TQ is important for the visual and qualitative understanding
of solutions.
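
To make the idea of an integral curve of D concrete, here is a minimal sketch for a single freedom. It assumes a simple pendulum, for which the accelerations can be solved for explicitly, and uses a standard fourth-order Runge-Kutta step; the value of g/l, the step size and all names are illustrative assumptions rather than anything fixed by the text. It also evaluates D(f) for the energy function, which should vanish.

```python
import math

G_OVER_L = 9.81  # assumed value of g/l for the illustration


def D(state):
    """Components (qdot, qddot) of the dynamical vector field at (q, qdot)."""
    q, qdot = state
    return (qdot, -G_OVER_L * math.sin(q))


def rk4_step(state, dt):
    """One fourth-order Runge-Kutta step along the vector field D."""
    def add(s, k, c):
        return (s[0] + c * k[0], s[1] + c * k[1])
    k1 = D(state)
    k2 = D(add(state, k1, dt / 2))
    k3 = D(add(state, k2, dt / 2))
    k4 = D(add(state, k3, dt))
    return (state[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            state[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)


def integral_curve(state0, dt, steps):
    """Approximate integral curve of D through state0, as a list of points in TQ."""
    curve = [state0]
    for _ in range(steps):
        curve.append(rk4_step(curve[-1], dt))
    return curve


def energy(state):
    """Energy per unit m l^2, a scalar function on TQ."""
    q, qdot = state
    return 0.5 * qdot**2 - G_OVER_L * math.cos(q)


def D_of(f, state, h=1e-6):
    """Rate of change D(f) of a scalar f on TQ, by central differences."""
    q, qdot = state
    dq, dqdot = D(state)
    df_dq = (f((q + h, qdot)) - f((q - h, qdot))) / (2 * h)
    df_dqdot = (f((q, qdot + h)) - f((q, qdot - h))) / (2 * h)
    return df_dq * dq + df_dqdot * dqdot


curve = integral_curve((1.0, 0.0), 1e-3, 2000)
print(energy(curve[0]), energy(curve[-1]))
```

The initial point (q, qdot) fixes the whole curve, and the energy is constant along it up to the integrator's error.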
(3): Canonical momenta are 1-forms:
Any point transformation, or any coordinate transformation $(q^i) \to (q'^i)$, in the
configuration manifold Q, induces a basis-change in the tangent space $T_q$ at $q \in Q$.
Consider any vector $\tau \in T_q$ with components $\dot q^i$ in coordinate system $(q^i)$ on Q, i.e.
$\tau = \frac{d}{dt} = \dot q^i \frac{\partial}{\partial q^i}$; (think of a motion through configuration q with generalized velocity $\tau$).
Its components $\dot q'^i$ in the coordinate system $(q'^i)$ (i.e. $\tau = \dot q'^i \frac{\partial}{\partial q'^i}$) are given by applying
the chain rule to $q'^i = q'^i(q^k)$:

$$\dot q'^i = \frac{\partial q'^i}{\partial q^k} \dot q^k ; \qquad (2.10)$$

so that we can "drop the dots":

$$\frac{\partial \dot q'^i}{\partial \dot q^j} = \frac{\partial q'^i}{\partial q^j} . \qquad (2.11)$$

One easily checks, using eq. 2.11, that for any L, the canonical momenta $p_i := \frac{\partial L}{\partial \dot q^i}$
form a 1-form on Q, transforming under $(q^i) \to (q'^i)$ by:

$$p'_i := \frac{\partial L'}{\partial \dot q'^i} = \frac{\partial q^k}{\partial q'^i} \frac{\partial L}{\partial \dot q^k} \equiv \frac{\partial q^k}{\partial q'^i} p_k . \qquad (2.12)$$

That is, the canonical momenta defined by L form a 1-form field on Q. (We will later
describe this as a cross-section of the cotangent bundle.)
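
The transformation law for the canonical momenta can be illustrated with the passage from cartesian coordinates (x, y) to plane polars (r, phi) for a free particle. In the sketch below, which is an assumed example and not the author's, the polar momenta are computed in two ways: directly from the polar form of the Lagrangian, and by contracting the cartesian momenta with the partial derivatives of (x, y) with respect to (r, phi). All names and numerical values are hypothetical.

```python
import math


def cartesian_velocity(r, phi, rdot, phidot):
    """Chain rule for the map (r, phi) -> (x, y) = (r cos phi, r sin phi)."""
    xdot = rdot * math.cos(phi) - r * math.sin(phi) * phidot
    ydot = rdot * math.sin(phi) + r * math.cos(phi) * phidot
    return xdot, ydot


def momenta_direct(m, r, rdot, phidot):
    """p_r and p_phi read off from L' = m (rdot^2 + r^2 phidot^2) / 2."""
    return m * rdot, m * r**2 * phidot


def momenta_transformed(m, r, phi, rdot, phidot):
    """p'_i = (dq^k / dq'^i) p_k, with p_x = m xdot and p_y = m ydot."""
    xdot, ydot = cartesian_velocity(r, phi, rdot, phidot)
    px, py = m * xdot, m * ydot
    p_r = math.cos(phi) * px + math.sin(phi) * py            # dx/dr, dy/dr
    p_phi = -r * math.sin(phi) * px + r * math.cos(phi) * py  # dx/dphi, dy/dphi
    return p_r, p_phi


args = (1.3, 2.0, 0.7, -0.4, 1.1)  # assumed m, r, phi, rdot, phidot
print(momenta_direct(args[0], args[1], args[3], args[4]))
print(momenta_transformed(*args))
```

The two pairs agree up to rounding, as the 1-form transformation law requires.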

(4): Geometric formulation of Lagrange's equations:
We can formulate Lagrange's equations in a coordinate-independent way, by using
three ingredients, namely:

(i) : L itself (a scalar, so coordinate-independent);

(ii) : the vector field D that L defines; and

(iii) : the 1-form on TQ defined locally, in terms of the natural coordinates $(q^i, \dot q^i)$, by

$$\theta_L := \frac{\partial L}{\partial \dot q^i} dq^i . \qquad (2.13)$$

(So the coefficients of $\theta_L$ for the other n elements of the dual basis, the $d\dot q^i$, are defined
to be zero.) This 1-form is called the canonical 1-form. We shall see that it plays a
role in Noether's theorem, and is centre-stage in Hamiltonian mechanics.

We combine these three ingredients using the idea of the Lie derivative of a 1-form
along a vector field.

We will write the Lie derivative of $\theta_L$ along the vector field D on TQ, as $\mathcal{L}_D \theta_L$. (It
is sometimes written as L; but we need the symbol L for the Lagrangian, and later
on, for left translation.) By the Leibniz rule, $\mathcal{L}_D \theta_L$ is:

$$\mathcal{L}_D \theta_L = \left( \mathcal{L}_D \frac{\partial L}{\partial \dot q^i} \right) dq^i + \frac{\partial L}{\partial \dot q^i} \mathcal{L}_D (dq^i) . \qquad (2.14)$$

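But the Lie derivative of any scalar function $f : TQ \to \mathbb{R}$ along any vector field X is
just X(f); and for the dynamical vector field D, this is just $\dot f = \frac{\partial f}{\partial q^i} \dot q^i + \frac{\partial f}{\partial \dot q^i} \ddot q^i$. So,
using also that the Lie derivative commutes with the exterior derivative, so that
$\mathcal{L}_D (dq^i) = d\dot q^i$, we have

$$\mathcal{L}_D \theta_L = \left( \frac{d}{dt} \frac{\partial L}{\partial \dot q^i} \right) dq^i + \frac{\partial L}{\partial \dot q^i} d\dot q^i . \qquad (2.15)$$

Rewriting the first term by the Lagrange equations, we get

$$\mathcal{L}_D \theta_L = \frac{\partial L}{\partial q^i} dq^i + \frac{\partial L}{\partial \dot q^i} d\dot q^i \equiv dL . \qquad (2.16)$$

We can conversely deduce the familiar Lagrange equations from eq. 2.16, by taking
coordinates. So we conclude that these equations' coordinate-independent form is:

$$\mathcal{L}_D \theta_L = dL . \qquad (2.17)$$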

(5): Towards the Hamiltonian framework:
Finally, a comment about the Lagrangian framework's limitations as regards solving
problems, and how they prompt the transition to Hamiltonian mechanics.

Recall the remark at the end of Section 2.1, that the n equations eq. 2.2 are in
general hard to solve for the $\ddot q^i(t_0)$: they lie buried in the left hand side of eq. 2.2. On
the other hand, the n equations $\dot q^i = \frac{dq^i}{dt}$ (the second group of n equations in (1) above)
are as simple as can be.

This makes it natural to seek another 2n-dimensional space of variables, $\xi^\alpha$ say
($\alpha = 1, \dots, 2n$), in which:

(i) : a motion is described by first-order equations, so that we have the same
advantage as in TQ that a unique trajectory passes through each point of the space;
but in which

(ii) : all 2n equations have the simple form $\frac{d\xi^\alpha}{dt} = f_\alpha(\xi^1, \dots, \xi^{2n})$ for some set of
functions $f_\alpha$ ($\alpha = 1, \dots, 2n$).

Indeed, Hamiltonian mechanics provides exactly such a space: it is usually 
the cotangent bundle of the configuration manifold, instead of its tangent bundle. 
But before turning to that, we expound Noether's theorem in the current Lagrangian 
framework. 



3 Noether's theorem in Lagrangian mechanics 
3.1 Preamble: a modest plan 

Any discussion of symmetry in Lagrangian mechanics must include a treatment of 
"Noether's theorem". The scare quotes are to indicate that there is more than one 
Noether's theorem. Quite apart from Noether's work in other branches of mathematics, 
her paper (1918) on symmetries and conserved quantities in Lagrangian theories has 
several theorems. I will be concerned only with applying her first theorem to finite- 
dimensional systems. In short: it provides, for any continuous symmetry of a system's 
Lagrangian, a conserved quantity called the 'momentum conjugate to the symmetry'. 
I stress at the outset that the great majority of subsequent applications and commentaries
(also for her other theorems, besides her first) are concerned with versions
of the theorems for infinite (i.e. continuous) systems. In fact, the context of Noether's 
investigation was contemporary debate about how to understand conservation prin- 
ciples and symmetries in the "ultimate classical continuous system", viz. gravitating 
matter as described by Einstein's general relativity. This theory can be given a La- 
grangian formulation: that is, the equations of motion, i.e. Einstein's field equations, 
can be deduced from a Hamilton's Principle with an appropriate Lagrangian. The 
contemporary debate was especially about the conservation of energy and the principle 
of general covariance (also known as: diffeomorphism invariance). General covariance 
prompts one to consider how a variational principle transforms under spacetime coor- 
dinate transformations that are arbitrary, in the sense of varying from point to point. 
This leads to the idea of "local" symmetries, which since Noether's time has been im- 
mensely fruitful in both classical and quantum physics, and in both a Lagrangian and 
Hamiltonian framework. 7 

So I agree that from the perspective of Noether's work, and its enormous later de- 
velopment, this Section's application of the first theorem to finite-dimensional systems 
is, as they say, "trivial". Furthermore, this application is easily understood, without 
having to adopt that perspective, or even having to consider infinite systems. In other 
words: its statement and proof are natural, and simple, enough that the nineteenth 
century masters of mechanics, like Hamilton, Jacobi and Poincaré, would certainly recognize
it in their own work, allowing of course for adjustments to modern language.
In fact, versions of it for the Galilei group of Newtonian mechanics and the Lorentz 
group of special relativity were published a few years before Noether's paper; (Brading 
and Brown (2003, p. 90); for details, cf. Kastrup (1987)). 8 

Nevertheless, it is worth expounding the finite-system version of Noether's first 
theorem. For: 

(i) : It generalizes Section 2.1's result about cyclic coordinates, and thereby the
elementary theorems of the conservation of momentum, angular momentum and energy 
which that result encompasses. The main generalization is that the theorem does not 
assume we have identified a cyclic coordinate. But on the other hand: every symmetry 
in the Noether sense will arise from a cyclic coordinate in some system q of generalized 
coordinates. (As we will see, this follows from the local existence and uniqueness of 
solutions of ordinary differential equations.) 

(ii) : This exposition will also prepare the way for our discussion of symmetry and 

7 Cf. Brading and Castellani (2003). Apart from papers specifically about Noether's theorem, this 
anthology's papers by Wallace, Belot and Earman (all 2003) are closest to this paper's concerns. 

8 Here again, 'versions of it' needs scare-quotes. For in what follows, I shall be more limited than 
these proofs, in two ways. (1): I limit myself, as I did in Section 2.2.1, both to time-independent
Lagrangians and to time-independent transformations: so my discussion does not encompass boosts. 
(2): I will take a symmetry of L to require that L be the very same; whereas some treatments 
allow the addition to L of the time-derivative of a function G(q) of the coordinates q — since such a 
time-derivative makes no difference to the Lagrange equations. 
conserved quantities in Hamiltonian mechanics. 9

In this exposition, I will also discuss en passant the distinction between: 

(i) the notion of symmetry at work in Noether's theorem, i.e. a symmetry of L, 
often called a variational symmetry; and 

(ii) the notion of a symmetry of the set of solutions of a differential equation: often 
called a dynamical symmetry. This notion applies to all sorts of differential equations, 
and systems of them; not just to those with the form of Lagrange's equations (i.e. 
derivable from a variational principle). In short, this sort of symmetry is a map
that sends any solution of the given equation(s) (in effect: a dynamically possible 
history of the system — a curve in the state-space) to some other solution. Finding such 
symmetries, and groups of them, is a central part of the modern theory of integration 
of differential equations (both ordinary and partial). 

Broadly speaking, this notion is more general than that of a symmetry of L. Not 
only does it apply to many other sorts of differential equation. Also, for Lagrange's 
equations: a symmetry of L is (with one caveat) a symmetry of the solutions, i.e. a 
dynamical symmetry — but the converse is false. 10 

In this Section, the plan is as follows. We define: 

(i) : a (continuous) symmetry as a vector field (on the configuration manifold Q)
that generates a family of transformations under which the Lagrangian is invariant
(Section 3.2);

(ii) : the momentum conjugate to a vector field, as (roughly) the rate of change of
the Lagrangian with respect to the $\dot q$s in the direction of the vector field (Section 3.3).

These two definitions lead directly to Noether's theorem (Section 3.4): after all the
stage-setting, the proof will be a one-liner application of Lagrange's equations.

3.2 Vector fields and symmetries — variational and dynamical 

I need to expound three topics: 

(1) : the idea of a vector field on the configuration manifold Q; and how to lift it to TQ;

(2) : the definition of a variational symmetry; 

(3) : the contrast between (2) and the idea of dynamical symmetry. 

Note that, as in previous Sections, I will often speak, for simplicity, "globally, not 
locally" , i.e. as if the relevant scalar functions, vector fields etc. are defined on all of 

9 Other expositions of Noether's theorem for finite-dimensional Lagrangian mechanics include: 
Arnold (1989: 88-89), Desloge (1982: 581-586), Lanczos (1986: 401-405: emphasizing the variational 
perspective) and Johns (2005: Chapter 13). Butterfield (2004a, Section 4.7) is a more detailed version 
of this Section. Beware: though many textbooks of Hamiltonian mechanics cover the Hamiltonian 
version of Noether's theorem (which, as we will see, is stronger), they often do not label it as such; 
and if they do label it, they often do not relate it clearly to the Lagrangian version. 

10 An excellent account of this modern integration theory, covering both ordinary and partial differ- 
ential equations, is given by Olver (2000). He also covers the Lagrangian case (Chapter 5 onwards), 
and gives many historical details especially about Lie's pioneering contributions. 
Q or TQ. Of course, they need not be.



3.2.1 Vector fields on TQ; lifting fields from Q to TQ 

We recall first that a differentiable vector field on Q is represented in a coordinate
system $q = (q^1, \dots, q^n)$ by n first-order ordinary differential equations

$$\frac{dq^i}{d\epsilon} = f^i(q^1, \dots, q^n) . \qquad (3.1)$$

A vector field generates a one-parameter family of active transformations: viz. passage
along the vector field's integral curves, by a varying parameter-difference $\epsilon$. The vector
field is called the infinitesimal generator of the family. It is common to write the
parameter as $\tau$, but in this Section we use $\epsilon$ to avoid confusion with t, which often
represents the time.

Similarly, a vector field defined on TQ corresponds to a system of 2n ordinary 
differential equations, and generates an active transformation of TQ. But I will consider 
only vector fields on TQ that mesh with the structure of TQ as a tangent bundle, in 
the sense that they are induced by vector fields on Q, in the following natural way. 

This induction has two ingredient ideas. 

First, any curve in Q (representing a possible state of motion) defines a corresponding
curve in TQ, because the functions $q^i(t)$ define the functions $\dot q^i(t)$. (Here t is the
parameter of the curve.) More formally: given any curve in configuration space,
$\phi : I \subset \mathbb{R} \to Q$, with coordinate expression in the q-system $t \in I \mapsto q(\phi(t)) \equiv q(t) = q^i(t)$, we
define its extension to TQ to be the curve $\Phi : I \subset \mathbb{R} \to TQ$ given in the corresponding
coordinates by $q^i(t), \dot q^i(t)$.

Second, any vector field X on Q generates displacements in any possible state of
motion, represented by a curve in Q with coordinate expression $q^i = q^i(t)$. Namely:
for a given value of the parameter $\epsilon$, the displaced state of motion is represented by
the curve in Q

$$q^i(t) + \epsilon X^i(q^i(t)) . \qquad (3.2)$$

Putting these ingredients together: we first displace a curve within Q, and then 
extend the result to TQ. Namely, the extension to TQ of the (curve representing) 
the displaced state of motion is given by the 2n functions, in two groups each of n 
functions, for the $(q, \dot{q})$ coordinate system 

$$q^i(t) + \epsilon X^i(q^i(t)) \quad \mathrm{and} \quad \dot{q}^i(t) + \epsilon Y^i(q^i(t), \dot{q}^i) ; \qquad (3.3)$$

where Y is defined to be the vector field on TQ that is the derivative along the original 
state of motion of X. That is: 

$$Y^i(q, \dot{q}) := \frac{dX^i}{dt} = \Sigma_j \frac{\partial X^i}{\partial q^j} \dot{q}^j . \qquad (3.4)$$

Thus displacements by a vector field within Q are lifted to TQ. The vector field X on 
Q lifts to TQ as $(X, \frac{dX}{dt})$; i.e. it lifts to the vector field that sends a point $(q^i, \dot{q}^i) \in TQ$ 
to $(q^i + \epsilon X^i, \dot{q}^i + \epsilon \frac{dX^i}{dt})$. 11 



3.2.2 The definition of variational symmetry 



To define variational symmetry, I begin with the integral notion and then give the 
differential notion. The idea is that the Lagrangian L, a scalar $L : TQ \to \mathbb{R}$, should 
be invariant under all the elements of a one-parameter family of active transformations 
$\theta_\epsilon : \epsilon \in I \subset \mathbb{R}$: at least in a neighbourhood of the identity map corresponding to $\epsilon = 0$, 
$\theta_0 = id_U$. (Here U is some open subset of TQ, maybe not all of it.) 

That is, we define the family $\theta_\epsilon : \epsilon \in I \subset \mathbb{R}$ to be a variational symmetry of L 
if L is invariant under the transformations: $L = L \circ \theta_\epsilon$, at least around $\epsilon = 0$. (We 
could use the correspondence between active and passive transformations to recast this 
definition, and what follows, in terms of a passive notion of symmetry as sameness of 
L's functional form in different coordinate systems. I leave this as an exercise! Or cf. 
Butterfield (2004a: Section 4.7.2).) 

For the differential notion of variational symmetry, we of course use the idea of a 
vector field. But we also impose Section 3.2.1's restriction to vector fields on TQ that 
are induced by vector fields on Q. So we define a vector field X on Q that generates 
a family of active transformations $\theta_\epsilon$ on TQ to be a variational symmetry of L if the 
first derivative of L with respect to $\epsilon$ is zero, at least around $\epsilon = 0$. More precisely: 
writing 

$$L \circ \theta_\epsilon = L(q^i + \epsilon X^i, \dot{q}^i + \epsilon Y^i) \quad \mathrm{with} \quad Y^i = \Sigma_j \frac{\partial X^i}{\partial q^j} \dot{q}^j , \qquad (3.5)$$

we say X is a variational symmetry iff the first derivative of L with respect to $\epsilon$ is zero 
(at least around $\epsilon = 0$). That is: X is a variational symmetry iff 

$$\Sigma_i X^i \frac{\partial L}{\partial q^i} + \Sigma_i Y^i \frac{\partial L}{\partial \dot{q}^i} = 0 . \qquad (3.6)$$



3.2.3 A contrast with dynamical symmetries 

The general notion of a dynamical symmetry, i.e. a symmetry of some equations of 
motion (whether Euler-Lagrange or not), is not needed for Section 3.4's presentation 
of Noether's theorem. But the notion is so important that I must mention it, though 
only to contrast it with variational symmetries. 

The general definition is roughly as follows. Given any system of differential equations, 
$\mathcal{E}$ say, a dynamical symmetry of the system is an active transformation $\zeta$ on the 

11 I have discussed this in terms of some system $(q, \dot{q})$ of coordinates. But the definitions of extensions 
and displacements are in fact coordinate-independent. Besides, one can show that the operations of 
displacing a curve within Q, and extending it to TQ, commute to first order in $\epsilon$: the result is the 
same for either order of the operations. 
system $\mathcal{E}$'s space of both independent variables, $x_j$ say, and dependent variables $y^i$ 
say, such that any solution of $\mathcal{E}$, $y^i = f^i(x_j)$ say, is carried to another solution. For 
a precise definition, cf. Olver (2000: Def. 2.23, p. 93), and his ensuing discussion of 
the induced action (called 'prolongation') of the transformation $\zeta$ on the spaces of (in 
general, partial) derivatives of the y's with respect to the x's (i.e. jet spaces). 

As I said in Section 3.1, groups of symmetries in this sense play a central role in the 
modern theory of differential equations: not just in finding new solutions, once given a 
solution, but also in integrating the equations. For some main theorems stating criteria 
(in terms of prolongations) for groups of symmetries, cf. Olver (2000: Theorem 2.27, 
p. 100, Theorem 2.36, p. 110, Theorem 2.71, p. 161). 

But for present purposes, it is enough to state the rough idea of a one-parameter 
group of dynamical symmetries (without details about prolongations!) for Lagrange's 
equations in the familiar form, eq. 2.1. 

In this simple case, there is just one independent variable x := t, so that: 

(a): we are considering ordinary, not partial, differential equations, with n dependent 
variables $y^i := q^i(t)$. 

(b): prolongations correspond to lifts of maps on Q to maps on TQ; cf. Section 3.2.1. 

Furthermore, in line with the discussion following Lagrange's equations eq. 2.1, the 
time-independence of the Lagrangian (time being a cyclic coordinate) means we can 
define dynamical symmetries $\zeta$ in terms of active transformations on the tangent bundle, 
$\theta : TQ \to TQ$, that are lifted from active transformations on Q. In effect, we define 
such a map $\zeta$ by just adjoining to any such $\theta : TQ \to TQ$ the identity map on the time 
variable $id : t \in \mathbb{R} \mapsto t$. (More formally: $\zeta : (q, \dot{q}, t) \in TQ \times \mathbb{R} \mapsto (\theta(q, \dot{q}), t) \in TQ \times \mathbb{R}$.) 

Then we define in the usual way what it is for a one-parameter family of such maps 
$\zeta_s : s \in I \subset \mathbb{R}$ to be a (local) one-parameter group of dynamical symmetries (for 
Lagrange's equations eq. 2.1): namely, if any solution curve q(t) (equivalently: its 
extension $q(t), \dot{q}(t)$ to TQ) of the Lagrange equations is carried by each $\zeta_s$ to another 
solution curve, with the $\zeta_s$ for different s composing in the obvious way, for s close 
enough to $0 \in I$. 

And finally: we also define (in a manner corresponding to the discussion at the 
end of Section 3.2.2) a differential, as against integral, notion of dynamical symmetry. 
Namely, we say a vector field X on Q is a dynamical symmetry if its lift to TQ 
(more precisely: its lift, with the identity map on the time variable adjoined) is the 
infinitesimal generator of such a one-parameter family $\zeta_s$. 

For us, the important point is that this notion of a dynamical symmetry is different 
from Section 3.2.2's notion of a variational symmetry. 12 As I announced in Section 3.1, 
a variational symmetry is (with one caveat) a dynamical symmetry — but the converse 
is false. Fortunately, the same simple example will serve both to show the subtlety 

12 Since the Lagrangian L is especially associated with variational principles, while the dynamics is 
given by equations of motion, calling Section 3.2.2's notion 'variational symmetry', and this notion 
'dynamical symmetry' is a good and widespread usage. But beware: it is not universal. 
about the first implication, and as a counterexample to the converse implication. This 
example is the two-dimensional harmonic oscillator. 13 

The usual Lagrangian is, with cartesian coordinates written as qs, and the contravariant 
indices written for clarity as subscripts: 

$$L_1 = \frac{1}{2}\left[\dot{q}_1^2 + \dot{q}_2^2 - \omega^2 (q_1^2 + q_2^2)\right] ; \qquad (3.7)$$

giving as Lagrange equations: 

$$\ddot{q}_i + \omega^2 q_i = 0 , \quad i = 1, 2. \qquad (3.8)$$

But these Lagrange equations, i.e. the same dynamics, are also given by 

$$L_2 = \dot{q}_1 \dot{q}_2 - \omega^2 q_1 q_2 . \qquad (3.9)$$

The rotations in the plane are of course a variational symmetry of $L_1$, and a dynamical 
symmetry of eq. 3.8. But they are not a variational symmetry of $L_2$. So a dynamical 
symmetry need not be a variational one. Besides, these equations contain another 
example to the same effect. Namely, the "squeeze" transformations 

$$q'_1 := e^{\eta} q_1 , \quad q'_2 := e^{-\eta} q_2 \qquad (3.10)$$

are a dynamical symmetry of eq. 3.8 but not a variational symmetry of $L_1$. So again: 
a dynamical symmetry need not be a variational one. 14 
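
The claims about which transformations leave which Lagrangian invariant can be checked numerically. The following is a minimal sketch, not part of the original argument: the frequency value, the sample point of TQ and the function names are all assumptions made for illustration. Since both transformations are linear in the qs, their lifts act on the velocities by the same matrices.

```python
import math

OMEGA = 1.3  # assumed oscillator frequency (illustrative value only)

def L1(q1, q2, v1, v2):
    """Usual oscillator Lagrangian, eq. 3.7 (v denotes the velocity q-dot)."""
    return 0.5 * (v1**2 + v2**2 - OMEGA**2 * (q1**2 + q2**2))

def L2(q1, q2, v1, v2):
    """Alternative Lagrangian, eq. 3.9, yielding the same Lagrange equations."""
    return v1 * v2 - OMEGA**2 * q1 * q2

def rotate(state, angle):
    """Rotation in the plane, lifted to TQ (velocities rotate like positions)."""
    q1, q2, v1, v2 = state
    c, s = math.cos(angle), math.sin(angle)
    return (c * q1 - s * q2, s * q1 + c * q2, c * v1 - s * v2, s * v1 + c * v2)

def squeeze(state, eta):
    """Squeeze transformation of eq. 3.10, lifted to TQ."""
    q1, q2, v1, v2 = state
    up, down = math.exp(eta), math.exp(-eta)
    return (up * q1, down * q2, up * v1, down * v2)

def is_invariant(lagrangian, transform, state, param, tol=1e-12):
    """True if the Lagrangian takes the same value before and after the map."""
    return abs(lagrangian(*transform(state, param)) - lagrangian(*state)) < tol

state = (0.7, -0.2, 0.4, 1.1)  # arbitrary sample point (q1, q2, v1, v2)
report = {
    ("L1", "rotation"): is_invariant(L1, rotate, state, 0.3),
    ("L2", "rotation"): is_invariant(L2, rotate, state, 0.3),
    ("L1", "squeeze"): is_invariant(L1, squeeze, state, 0.3),
    ("L2", "squeeze"): is_invariant(L2, squeeze, state, 0.3),
}
```

A check at one sample point and one parameter value can only refute invariance, not prove it; but it agrees with the text: rotations preserve $L_1$ and not $L_2$, while the squeeze preserves $L_2$ and not $L_1$.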

I turn to the first implication: that every variational symmetry is a dynamical 
symmetry. This is true: general and abstract proofs (applying also to continuous 
systems i.e. field theories) can be found in Olver (2000: theorem 4.14, p. 255; theorem 
4.34, p. 278; theorem 5.53, p. 332). 

But beware of a condition of the theorem. (This is the caveat mentioned at the end 
of Section 3.1.) The theorem requires that all the variables q (for continuous systems: 
all the fields $\phi$) be subject to Hamilton's Principle. The need for this condition is shown 
by rotations in the plane, which are a variational symmetry of the familiar Lagrangian 
$L_1$ above. But it is easy to show that such a rotation is a dynamical symmetry of one 
of the Lagrange equations, say the equation for the variable $q_1$ 

$$\ddot{q}_1 + \omega^2 q_1 = 0 , \qquad (3.11)$$

only if the corresponding Lagrange equation holds for $q_2$. 

13 All the material to the end of this Subsection is drawn from Brown and Holland (2004a); cf. also 
their (2004). The present use of the harmonic oscillator example also occurs in Morandi et al (1990: 
203-204). 

14 In the light of this, you might ask about a more restricted implication: viz. must every dynamical 
symmetry of a set of equations of motion be a variational symmetry of some or other Lagrangian 
that yields the given equations as the Euler-Lagrange equations of Hamilton's Principle? Again, the 
answer is No for the simple reason that there are many (sets of) equations of motion that are not 
Euler-Lagrange equations of any Lagrangian, and yet have dynamical symmetries. 

Wigner (1954) gives an example. The general question of under what conditions is a set of ordinary 
differential equations the Euler-Lagrange equations of some Hamilton's Principle is the inverse problem 
of Lagrangian mechanics. It is a large subject with a long history; cf. e.g. Santilli (1979), Lopuszanski 
(1999). 



3.3 The conjugate momentum of a vector field 



Now we define the momentum conjugate to a vector field X to be the scalar function 
on TQ: 

$$p_X : TQ \to \mathbb{R} ; \quad p_X := \Sigma_i X^i \frac{\partial L}{\partial \dot{q}^i} . \qquad (3.12)$$

(For a time-dependent Lagrangian, $p_X$ would be a scalar function on $TQ \times \mathbb{R}$, with $\mathbb{R}$ 
representing time.) 

We shall see in the next Subsection's examples that this definition generalizes in an 
appropriate way Section 2.1's definition of the momentum conjugate to a coordinate q. 

But first note that it is an improvement in the sense that, while the momentum 
conjugate to a coordinate q depends on the choice made for the other coordinates, the 
momentum $p_X$ conjugate to a vector field X is independent of the coordinates chosen. 
Though this point is not needed in order to prove Noether's theorem, here is the proof. 

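We first apply the chain-rule to $L = L(q'(q), \dot{q}'(q, \dot{q}))$ and eq. 2.11 ("cancellation of 
the dots"), to get 

$$\frac{\partial L}{\partial \dot{q}^i} = \Sigma_j \frac{\partial L}{\partial \dot{q}'^j} \frac{\partial \dot{q}'^j}{\partial \dot{q}^i} = \Sigma_j \frac{\partial L}{\partial \dot{q}'^j} \frac{\partial q'^j}{\partial q^i} . \qquad (3.13)$$

Then using the transformation law for components of a vector field 

$$X'^i = \Sigma_j \frac{\partial q'^i}{\partial q^j} X^j \qquad (3.14)$$

and relabelling i and j, we deduce: 

$$p'_X = \Sigma_i X'^i \frac{\partial L}{\partial \dot{q}'^i} = \Sigma_{ij} X^j \frac{\partial q'^i}{\partial q^j} \frac{\partial L}{\partial \dot{q}'^i} = \Sigma_{ij} X^i \frac{\partial q'^j}{\partial q^i} \frac{\partial L}{\partial \dot{q}'^j} = \Sigma_i X^i \frac{\partial L}{\partial \dot{q}^i} \equiv p_X . \qquad (3.15)$$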



Finally, I remark incidentally that in the geometric formulation of Lagrangian mechanics 
(Section 2.2), the coordinate-independence of $p_X$ becomes, unsurprisingly, a 
triviality. Namely: $p_X$ is obviously the contraction of X as lifted to TQ with the 
canonical 1-form on TQ that we defined in eq. 2.13: 

$$\theta_L := \frac{\partial L}{\partial \dot{q}^i} dq^i . \qquad (3.16)$$

We will return to this at the end of Section 3.4.1. 



3.4 Noether's theorem; and examples 

Given just the definition of conjugate momentum, eq. 3.12, the proof of Noether's 
theorem is immediate. (The interpretation and properties of this momentum, discussed 
in the last Subsection, are not needed.) The theorem says: 

Noether's theorem for Lagrangian mechanics If X is a (variational) 
symmetry of a system with Lagrangian $L(q, \dot{q}, t)$, then X's conjugate momentum 
is a constant of the motion. 



Proof: We just calculate the derivative of the momentum eq. 3.12 along the solution 
curves in TQ, and apply Lagrange's equations and the definitions of $Y^i$, and of 
symmetry eq. 3.6: 

$$\frac{dp_X}{dt} = \Sigma_i \frac{dX^i}{dt} \frac{\partial L}{\partial \dot{q}^i} + \Sigma_i X^i \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}^i}\right) \qquad (3.17)$$

$$= \Sigma_i Y^i \frac{\partial L}{\partial \dot{q}^i} + \Sigma_i X^i \frac{\partial L}{\partial q^i} = 0 . \quad \mathrm{QED.}$$

Examples: — This proof, though neat, is a bit abstract! So here are two examples, 
both of which return us to examples we have already seen. 

(1): The first example is a shift in a cyclic coordinate $q^n$: i.e. the case with which 
our discussion of Noether's theorem began at the end of Section 2.1. So suppose $q^n$ is 
cyclic, and define a vector field X by 

$$X^1 = 0, \dots, X^{n-1} = 0, X^n = 1 . \qquad (3.18)$$

So the displacements generated by X are translations by an amount $\epsilon$ in the $q^n$-direction. 
Then $Y^i := \frac{dX^i}{dt}$ vanishes, and the definition of (variational) symmetry 
eq. 3.6 reduces to 

$$\frac{\partial L}{\partial q^n} = 0 . \qquad (3.19)$$

So since $q^n$ is assumed to be cyclic, X is a symmetry. And the momentum conjugate 
to X, which Noether's theorem tells us is a constant of the motion, is the familiar one: 

$$p_X := \Sigma_i X^i \frac{\partial L}{\partial \dot{q}^i} = \frac{\partial L}{\partial \dot{q}^n} . \qquad (3.20)$$

As mentioned in Section 3.1, this example is universal, in that every symmetry X 
arises, around any point where X is non-zero, from a cyclic coordinate in some local 
system of coordinates. This follows from the basic theorem about the local existence 
and uniqueness of solutions of ordinary differential equations. We can state the theorem 
as follows; (cf. e.g. Arnold (1973: 48-49, 77-78, 249-250), Olver (2000: Prop 1.29)). 

Consider a system of n first-order ordinary differential equations on an open subset 
U of an n-dimensional manifold 

$$\dot{q}^i = X^i(q) \equiv X^i(q^1, \dots, q^n) , \quad q \in U ; \qquad (3.21)$$

equivalently, a vector field X on U. Let $q_0$ be a non-singular point of the vector field, 
i.e. $X(q_0) \neq 0$. Then in a sufficiently small neighbourhood V of $q_0$, there is a coordinate 
system (formally, a diffeomorphism $f : V \to W \subset \mathbb{R}^n$) such that, writing $y_i : \mathbb{R}^n \to \mathbb{R}$ 
for the standard coordinates on W and $e_i$ for the ith standard basic vector of $\mathbb{R}^n$, eq. 
3.21 goes into the very simple form 

$$\dot{y} = e_n ; \quad \mathrm{i.e.} \quad \dot{y}_n = 1, \ \dot{y}_1 = \dot{y}_2 = \dots = \dot{y}_{n-1} = 0 \quad \mathrm{in} \ W . \qquad (3.22)$$

(In terms of the tangent map (also known as: push-forward) $f_*$ on tangent vectors that 
is induced by f: $f_*(X) = e_n$ in W.) On account of eq. 3.22's simple form, Arnold 
suggests the theorem might well be called the 'rectification theorem'. 

We should note two points about the theorem: 

(i): The rectifying coordinate system f may of course be very hard to find. So the 
theorem by no means makes all problems "trivially soluble"; cf. again footnote 4. 

(ii): The theorem has an immediate corollary about local constants of the motion. 
Namely: n first-order ordinary differential equations have, locally, $n - 1$ functionally 
independent constants of the motion (also known as: first integrals). They are given, 
in the above notation, by $y_1, \dots, y_{n-1}$. 
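
As a small worked illustration of the theorem (my own choice of example, not one drawn from the sources just cited), consider the rotation field on the punctured plane, so that n = 2. Polar coordinates rectify it, at least locally, and the radius is the single local constant of the motion promised by point (ii). The symbols r and $\theta$ are introduced here only for this example.

```latex
% Vector field on U = \mathbb{R}^2 \setminus \{0\}, written in the form of eq. 3.21:
\dot{q}^1 = -q^2 , \qquad \dot{q}^2 = q^1 .
% Candidate rectifying coordinates (valid locally, e.g. where q^1 \neq 0):
y_1 := r = \sqrt{(q^1)^2 + (q^2)^2} , \qquad y_2 := \theta = \arctan(q^2 / q^1) .
% Differentiating along the integral curves:
\dot{y}_1 = \frac{q^1 \dot{q}^1 + q^2 \dot{q}^2}{r} = \frac{-q^1 q^2 + q^2 q^1}{r} = 0 ,
\qquad
\dot{y}_2 = \frac{q^1 \dot{q}^2 - q^2 \dot{q}^1}{r^2} = \frac{(q^1)^2 + (q^2)^2}{r^2} = 1 .
% So the field takes the form of eq. 3.22 with n = 2, and y_1 = r is the
% n - 1 = 1 functionally independent constant of the motion.
```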

We now apply the rectification theorem, so as to reverse the reasoning in the above 
example of $q^n$ cyclic. That is: assuming X is a symmetry, let us rectify it, i.e. let us 
pass to a coordinate system (q) such that eq. 3.18 holds. Then, as above, $Y^i := \frac{dX^i}{dt}$ 
vanishes; and X's being a (variational) symmetry, eq. 3.6, reduces to $q^n$ being cyclic; 
and the momentum conjugate to X, $p_X$, reduces to the familiar conjugate momentum 
$p_n = \frac{\partial L}{\partial \dot{q}^n}$. Thus every symmetry X arises locally from a cyclic coordinate $q^n$, and the 
corresponding conserved momentum is $p_n$. (But note that this may hold only "very 
locally": the domain V of the coordinate system f in which X generates displacements 
in the direction of the cyclic coordinate $q^n$ can be smaller than the set U on which X 
is a symmetry.) 

In Section 5.3, the fact that every symmetry arises locally from a cyclic coordinate 
will be important for understanding the Hamiltonian version of Noether's theorem. 

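(2): Let us now look at our previous example, the angular momentum of a free 
particle (eq. 2.6), in the cartesian coordinate system, i.e. a coordinate system without 
cyclic coordinates. So let $q_1 := x$, $q_2 := y$, $q_3 := z$. (In this example, subscripts will 
again be a bit clearer.) Then a small rotation about the x-axis 

$$\delta x = 0, \quad \delta y = -\epsilon z, \quad \delta z = \epsilon y \qquad (3.23)$$

corresponds to a vector field X with components 

$$X_1 = 0, \quad X_2 = -q_3, \quad X_3 = q_2 \qquad (3.24)$$

so that the $Y_i$ are 

$$Y_1 = 0, \quad Y_2 = -\dot{q}_3, \quad Y_3 = \dot{q}_2 . \qquad (3.25)$$

For the Lagrangian 

$$L = \frac{1}{2} m (\dot{q}_1^2 + \dot{q}_2^2 + \dot{q}_3^2) \qquad (3.26)$$

X is a (variational) symmetry since the definition of symmetry eq. 3.6 now reduces to 

$$\Sigma_i \left( X_i \frac{\partial L}{\partial q_i} + Y_i \frac{\partial L}{\partial \dot{q}_i} \right) = -\dot{q}_3 \frac{\partial L}{\partial \dot{q}_2} + \dot{q}_2 \frac{\partial L}{\partial \dot{q}_3} = 0 . \qquad (3.27)$$

So Noether's theorem then tells us that X's conjugate momentum is conserved. It is 

$$p_X := \Sigma_i X_i \frac{\partial L}{\partial \dot{q}_i} = X_2 \frac{\partial L}{\partial \dot{q}_2} + X_3 \frac{\partial L}{\partial \dot{q}_3} = -m z \dot{y} + m y \dot{z} \qquad (3.28)$$

which is indeed the x-component of angular momentum. 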
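
This example is simple enough to check by direct computation. The sketch below is an illustration only: the mass, the initial data and the function names are assumptions, and the free motion is written in closed form rather than integrated. It evaluates the left-hand side of the symmetry condition (eq. 3.27) and then the momentum of eq. 3.28 at several times along one solution curve.

```python
M = 2.0  # assumed particle mass (illustrative value only)

def X(q):
    """Components of the rotation generator, eq. 3.24."""
    return (0.0, -q[2], q[1])

def Y(v):
    """Components Y_i = dX_i/dt, eq. 3.25 (v denotes the velocity q-dot)."""
    return (0.0, -v[2], v[1])

def dL_dq(q, v):
    """Partial derivatives of the free Lagrangian (eq. 3.26) w.r.t. q_i."""
    return (0.0, 0.0, 0.0)

def dL_dv(q, v):
    """Partial derivatives of the free Lagrangian (eq. 3.26) w.r.t. q-dot_i."""
    return tuple(M * vi for vi in v)

def symmetry_condition(q, v):
    """Left-hand side of eq. 3.6, which eq. 3.27 shows to vanish."""
    return (sum(x * a for x, a in zip(X(q), dL_dq(q, v)))
            + sum(y * b for y, b in zip(Y(v), dL_dv(q, v))))

def p_X(q, v):
    """Momentum conjugate to X, eq. 3.28."""
    return sum(x * b for x, b in zip(X(q), dL_dv(q, v)))

def free_motion(q0, v0, t):
    """Exact solution of the free-particle Lagrange equations."""
    return tuple(qi + vi * t for qi, vi in zip(q0, v0)), v0

q0, v0 = (0.3, -1.2, 0.5), (0.9, 0.4, -0.7)  # arbitrary initial data
condition = symmetry_condition(q0, v0)
momenta = [p_X(*free_motion(q0, v0, t)) for t in (0.0, 1.0, 2.5)]
```

Up to rounding error, `condition` vanishes and the entries of `momenta` coincide, as Noether's theorem requires.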



3.4.1 A geometrical formulation 

We can give a geometric formulation of Noether's theorem by using the vanishing of 
the Lie derivative to express constancy along the integral curves of a vector field. There 
are two vector fields on TQ to consider: the dynamical vector field D (cf. eq. 2.8), 
and the lift to TQ of the vector field X that is the variational symmetry. 

I will now write $\bar{X}$ for this lift. So given the vector field X on Q 

$$X = X^i(q) \frac{\partial}{\partial q^i} , \qquad (3.29)$$

the lift $\bar{X}$ of X to TQ is, by eq. 3.4, 

$$\bar{X} = X^i(q) \frac{\partial}{\partial q^i} + \frac{\partial X^i(q)}{\partial q^j} \dot{q}^j \frac{\partial}{\partial \dot{q}^i} , \qquad (3.30)$$

where the q argument of $X^i$ emphasises that the $X^i$ do not depend on $\dot{q}$. 

That X is a variational symmetry means that in TQ, the Lie derivative of L along 
the lift $\bar{X}$ vanishes: $\mathcal{L}_{\bar{X}} L = 0$. On the other hand, we know from eq. 3.16 that the 
momentum $p_X$ conjugate to X is the contraction <;> of $\bar{X}$ with the canonical 1-form 
$\theta_L := \frac{\partial L}{\partial \dot{q}^i} dq^i$ on TQ: 

$$p_X := X^i \frac{\partial L}{\partial \dot{q}^i} \equiv < \bar{X}; \theta_L > . \qquad (3.31)$$

So Noether's theorem says: 

If $\mathcal{L}_{\bar{X}} L = 0$, then $\mathcal{L}_D < \bar{X}; \theta_L > = 0$. 

Note finally that eq. 3.31 shows that the theorem has no converse. That is: given 
that a dynamical variable $p : TQ \to \mathbb{R}$ is a constant of the motion, $\mathcal{L}_D p = 0$, there 
is no single vector field $\bar{X}$ on TQ such that $p = < \bar{X}; \theta_L >$. For given such a $\bar{X}$, one 
could get another by adding any field $\bar{Y}$ for which $< \bar{Y}; \theta_L > = 0$. However, we will see 
in Section 5.2 that in Hamiltonian mechanics a constant of the motion does determine 
a corresponding vector field on the state space. 



4 Hamiltonian mechanics introduced 



4.1 Preamble 

From now on this paper adopts the Hamiltonian framework. As we shall see, its de- 
scription of symmetry and conserved quantities is in various ways more straightforward 
and powerful than that of the Lagrangian framework. 

The main idea is to replace the $\dot{q}$s by the canonical momenta, the ps. More generally, 
the state-space is no longer the tangent bundle TQ but a phase space $\Gamma$, which we 
take to be the cotangent bundle T*Q. (Here, the phrase 'we take to be' just signals the 
fact that eventually we will glimpse a more general kind of Hamiltonian 
state-space, viz. Poisson manifolds.) 

Admittedly, the theory on TQ given by Lagrange's equations eq. 2.1 is equivalent 
to the Hamiltonian theory on T*Q given by eq. 4.5 below, once we assume the Hessian 
condition eq. 2.3. 

But of course, theories can be formally equivalent, but different as regards their 
power for solving problems, their heuristic value and even their interpretation. In our 
case, two advantages of Hamiltonian mechanics over Lagrangian mechanics are commonly 
emphasised. (i): The first concerns its greater power or flexibility for describing 
a given system, that Lagrangian methods can also describe (and so its greater power 
for solving problems about such a system). (ii): The second concerns the broader idea 
of describing other systems. In more detail: 

(i): Hamiltonian mechanics replaces the group of point transformations, $q \to q'$ on 
Q, together with their lifts to TQ, by a "corresponding larger" group of transformations 
on $\Gamma$, the group of canonical transformations (also known as, for the standard 
case where $\Gamma = T^*Q$: the symplectic group). 

This group "corresponds" to the point transformations (and their lifts) in that 
while for any Lagrangian L, Lagrange's equations eq. 2.1 are covariant under all the 
point transformations, Hamilton's equations eq. 4.5 below are (for any Hamiltonian 
H) covariant under all canonical transformations. And it is a "larger" group because: 

(a) any point transformation together with its lift to TQ is a canonical transformation 
(more precisely: it naturally defines a canonical transformation on T*Q); 

(b) not every canonical transformation is thus induced by a point transformation; 
for a canonical transformation can "mix" the qs and ps in a way that point transformations 
and their lifts cannot. 

There is a rich and multi-faceted theory of canonical transformations, to which 
there are three main approaches — generating functions, integral invariants and sym- 
plectic geometry. I will adopt the symplectic approach, but not need many details 
about it. In particular, we will need only a few details about how the "larger" group 
of canonical transformations makes for a more powerful version of Noether's theorem. 

(ii) : The Hamiltonian framework connects analytical mechanics with other fields 
of physics, especially statistical mechanics and optics. The first connection goes via 
canonical transformations, especially using the integral invariants approach. The second 
connection goes via Hamilton-Jacobi theory; (for a philosopher's exposition, with 
an eye on quantum theory, cf. Butterfield (2004b: especially Sections 7-9)). 15 

With its theme of symmetry and conservation, this paper will illustrate (i), greater 
power in describing a given system, rather than (ii), describing other systems. As to 
(i), we will see two main ways in which the Hamiltonian framework is more powerful 
than the Lagrangian one. First, cyclic coordinates will "do more work for us" (Section 
4.2). Second, the Hamiltonian version of Noether's theorem is both: more powerful, 
thanks to the use of the "larger" group of canonical transformations; and more easily 
proven, thanks to the use of Poisson brackets (Section 5). 

So from now on, the broad plan is as follows. After Section 4.2's deduction of Hamilton's 
equations, Section 4.3 introduces symplectic structure, starting from the "naive" 
form of the symplectic matrix. Section 5 presents Poisson brackets, and the Hamiltonian 
version of Noether's theorem. Finally, Section 6 gives a geometric perspective, 
corresponding to Section 2.2's geometric perspective on the Lagrangian framework. 



4.2 Hamilton's equations 
4.2.1 The equations introduced 

Recall the vision in (5) of Section 2.2.2 that we seek 2n new variables, $\xi^\alpha$ say, 
$\alpha = 1, \dots, 2n$, in which Lagrange's equations take the simple form 

$$\frac{d\xi^\alpha}{dt} = f_\alpha(\xi^1, \dots, \xi^{2n}) . \qquad (4.1)$$

We can find the desired variables $\xi^\alpha$ by using the canonical momenta 

$$p_i := \frac{\partial L}{\partial \dot{q}^i} =: L_{\dot{q}^i} , \qquad (4.2)$$

to write the 2n Lagrange equations as 

$$\frac{dp_i}{dt} = \frac{\partial L}{\partial q^i} ; \quad \frac{dq^i}{dt} = \dot{q}^i . \qquad (4.3)$$

These are of the desired simple form, except that the right hand sides need to be written 
as functions of (q, p, t) rather than $(q, \dot{q}, t)$. (Here and in the next two paragraphs, we 
temporarily allow time-dependence, since the deduction is unaffected: the time variable 
is "carried along unaffected". In the terms of Section 2.1, this means allowing non-scleronomous 
constraints and a time-dependent work-function U.) 

15 Of course, some aspects of Hamiltonian mechanics illustrate both (i) and (ii). For example, 
Liouville's theorem on the preservation of phase space volume illustrates both (i)'s integral invariants 
approach to canonical transformations and (ii) 's connection to statistical mechanics. 
For the second group of n equations, this is in principle straightforward, given our 
assumption of a non-zero Hessian, eq. 2.3. This implies that we can invert eq. 4.2 so 
as to get the n $\dot{q}^i$ as functions of (q, p, t). We can then apply this to the first group of 
equations; i.e. we substitute $\dot{q}^i(q, p, t)$ wherever $\dot{q}^i$ appears in any right hand side. 

But we need to be careful: the partial derivative of $L(q, \dot{q}, t)$ with respect to $q^i$ is 
not the same as the partial derivative of $\hat{L}(q, p, t) := L(q, \dot{q}(q, p, t), t)$ with respect to 
$q^i$, since the first holds fixed the $\dot{q}$s, while the second holds fixed the ps. A comparison 
of these partial derivatives leads, with algebra, to the result that if we define the 
Hamiltonian function by 

$$H(q, p, t) := p_i \dot{q}^i(q, p, t) - \hat{L}(q, p, t) \qquad (4.4)$$

then the 2n equations eq. 4.3 go over to Hamilton's equations 

$$\frac{dp_i}{dt} = -\frac{\partial H}{\partial q^i} ; \quad \frac{dq^i}{dt} = \frac{\partial H}{\partial p_i} . \qquad (4.5)$$

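So we have cast our 2n equations in the simple form, $\frac{d\xi^\alpha}{dt} = f_\alpha(\xi^1, \dots, \xi^{2n})$, requested in 
(5) of Section 2.2. More explicitly: defining 

$$\xi^\alpha = q^\alpha , \ \alpha = 1, \dots, n ; \quad \xi^\alpha = p_{\alpha - n} , \ \alpha = n+1, \dots, 2n \qquad (4.6)$$

Hamilton's equations become 

$$\dot{\xi}^\alpha = \frac{\partial H}{\partial \xi^{\alpha + n}} , \ \alpha = 1, \dots, n ; \quad \dot{\xi}^\alpha = -\frac{\partial H}{\partial \xi^{\alpha - n}} , \ \alpha = n+1, \dots, 2n . \qquad (4.7)$$

To sum up: a single function H determines, through its partial derivatives, the evolution 
of all the qs and ps, and so the evolution of the state of the system. 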
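
To make the passage from eq. 4.2 to eq. 4.5 concrete, here is the standard one-dimensional case. It assumes, purely for illustration, a particle of mass m in a potential V(q); neither symbol is fixed by the discussion above.

```latex
% Assumed Lagrangian, n = 1:
L(q, \dot{q}) = \tfrac{1}{2} m \dot{q}^2 - V(q) .
% Canonical momentum (eq. 4.2), inverted for the velocity (the Hessian is m \neq 0):
p = \frac{\partial L}{\partial \dot{q}} = m \dot{q}
\quad \Longrightarrow \quad \dot{q}(q, p) = \frac{p}{m} .
% Hamiltonian (eq. 4.4):
H(q, p) = p \, \frac{p}{m} - \left[ \frac{p^2}{2m} - V(q) \right] = \frac{p^2}{2m} + V(q) .
% Hamilton's equations (eq. 4.5):
\frac{dp}{dt} = -\frac{\partial H}{\partial q} = -V'(q) ,
\qquad
\frac{dq}{dt} = \frac{\partial H}{\partial p} = \frac{p}{m} .
```

Eliminating p from the last two equations returns $m \ddot{q} = -V'(q)$, i.e. Lagrange's equation for this L.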



4.2.2 Cyclic coordinates in the Hamiltonian framework 

Just from the form of Hamilton's equations, we can immediately see a result that 
is significant for our theme of how symmetries and conserved quantities reduce the 
number of variables involved in a problem. In short, we can see that with Hamilton's 
equations in hand, cyclic coordinates will "do more work for us" than they do in the 
Lagrangian framework. 

More specifically, recall the basic Lagrangian result from the end of Section 2.1, that 
the generalized momentum $p_n := \frac{\partial L}{\partial \dot{q}^n}$ is conserved if, indeed iff, its conjugate coordinate 
$q^n$ is cyclic, $\frac{\partial L}{\partial q^n} = 0$. And recall from Section 3.4 that this result underpinned Noether's 
theorem in the precise sense of being "universal" for it. Corresponding results hold in 
the Hamiltonian framework, but are in certain ways more powerful. 

Thus we first observe that the transformation "from the $\dot{q}$s to the ps", i.e. the 
transition between Lagrangian and Hamiltonian frameworks, does not involve the dependence 
on the qs. More precisely: partially differentiating eq. 4.4 with respect to 
$q^n$, we obtain 

$$\frac{\partial H}{\partial q^n} \equiv \left. \frac{\partial H}{\partial q^n} \right|_{p} = - \left. \frac{\partial L}{\partial q^n} \right|_{\dot{q}} \equiv - \frac{\partial L}{\partial q^n} . \qquad (4.8)$$

(The other two terms are plus and minus $p_i \frac{\partial \dot{q}^i}{\partial q^n}$, and so cancel.) So a coordinate $q^n$ 
that is cyclic in the Lagrangian sense is also cyclic in the obvious Hamiltonian sense, 
viz. that $\frac{\partial H}{\partial q^n} = 0$. But by Hamilton's equations, this is equivalent to $\dot{p}_n = 0$. So we 
have the result corresponding to the Lagrangian one: $p_n$ is conserved iff $q^n$ is cyclic (in 
the Hamiltonian sense). 

We will see in Section 5.3 that this result underpins the Hamiltonian version of 
Noether's theorem; just as the corresponding Lagrangian result underpinned the Lagrangian 
version of Noether's theorem (cf. discussion after eq. 3.20). 

But we can already see that this result gives the Hamiltonian formalism an advantage 
over the Lagrangian. In the latter, the generalized velocity corresponding to a 
cyclic coordinate, $\dot{q}_n$, will in general still occur in the Lagrangian. The Lagrangian will 
be $L(q_1, \dots, q_{n-1}, \dot{q}_1, \dots, \dot{q}_n, t)$, so that we still face a problem in n variables. 

But in the Hamiltonian formalism, $p_n$ will be a constant of the motion, $\alpha$ say, so 
that the Hamiltonian will be $H(q_1, \dots, q_{n-1}, p_1, \dots, p_{n-1}, \alpha, t)$. So we now face a problem 
in $n - 1$ variables, $\alpha$ being simply determined by the initial conditions. That is: 
after solving the problem in $n - 1$ variables, $q_n$ is determined just by quadrature: i.e. 
just by integrating (perhaps numerically) the equation 

$$\dot{q}_n = \frac{\partial H}{\partial \alpha} , \qquad (4.9)$$

where, thanks to having solved the problem in $n - 1$ variables, the right-hand side is 
now an explicit function of t. 
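
The two-stage procedure just described (solve the reduced problem, then do one quadrature) can be sketched numerically. The example is my own and is only illustrative: a free particle in the plane in polar coordinates, with assumed Hamiltonian $H = p_r^2/2m + p_\theta^2/2mr^2$, so that $\theta$ is cyclic and $p_\theta = \alpha$. The function names, step size and initial data are likewise assumptions. The reduced problem in $(r, p_r)$ is integrated by a standard fourth-order Runge-Kutta step, and $\theta$ is then recovered by trapezoidal quadrature of eq. 4.9.

```python
import math

def solve_reduced(r0, pr0, alpha, m, dt, steps):
    """RK4 for the reduced problem: dr/dt = p_r/m, dp_r/dt = alpha^2/(m r^3).

    Returns the list of r-values on the time grid 0, dt, ..., steps*dt.
    """
    def rhs(r, pr):
        return pr / m, alpha**2 / (m * r**3)

    rs = [r0]
    r, pr = r0, pr0
    for _ in range(steps):
        k1 = rhs(r, pr)
        k2 = rhs(r + 0.5 * dt * k1[0], pr + 0.5 * dt * k1[1])
        k3 = rhs(r + 0.5 * dt * k2[0], pr + 0.5 * dt * k2[1])
        k4 = rhs(r + dt * k3[0], pr + dt * k3[1])
        r += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        pr += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        rs.append(r)
    return rs

def quadrature_theta(rs, alpha, m, dt, theta0=0.0):
    """Trapezoidal integration of d(theta)/dt = dH/d(alpha) = alpha/(m r^2)."""
    rate = [alpha / (m * r**2) for r in rs]
    theta = theta0
    for a, b in zip(rate, rate[1:]):
        theta += 0.5 * dt * (a + b)
    return theta

# Particle starting at (x, y) = (1, 0) with velocity (0, v): it moves along
# the straight line x = 1, so exactly r(t) = sqrt(1 + (v t)^2) and
# theta(t) = atan(v t).
m, v = 1.0, 1.0
alpha = m * v          # p_theta = m r^2 d(theta)/dt at t = 0, where r = 1
dt, steps = 0.001, 2000  # total time T = 2
rs = solve_reduced(1.0, 0.0, alpha, m, dt, steps)
theta_T = quadrature_theta(rs, alpha, m, dt)
```

Here $\alpha$ is fixed once and for all by the initial conditions, and the angle never enters the reduced integration: it is obtained afterwards from the stored values of r alone.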

This result is very simple. But it is an important illustration of the power of the 
Hamiltonian framework. Indeed, Arnold remarks (1989: 68) that 'almost all the solved 
problems in mechanics have been solved by means of it'! 

No doubt his point is, at least in part, that this result underpins the Hamiltonian 
version of Noether's theorem. But I should add that the result also motivates the 
study of various notions related to the idea of cyclic coordinates, such as constants of 
the motion being in involution (i.e. having zero Poisson bracket with each other), and 
a system being completely integrable (in the sense of Liouville). These notions have 
played a large part in the way that Hamiltonian mechanics has developed, especially 
in its theory of canonical transformations. And they play a large part in the way 
Hamiltonian mechanics has solved countless problems. But as announced in Section 
4.1, this paper will not go into these aspects of Hamiltonian mechanics, since they are 
not needed for our theme of symmetry and conservation; (for a philosophical discussion 
of these aspects, cf. Butterfield 2005). 
4.2.3 The Legendre transformation and variational principles 

To end this Subsection, I note two aspects of this transition from Lagrange's equations 
to Hamilton's. For, although I shall not need details about them, they each lead to a 
rich theory: 

(i): The transformation "from the $\dot{q}$s to the ps" is the Legendre transformation. It 
has a striking geometric interpretation. In the simplest case, it concerns the fact that 
one can describe a smooth convex real function $y = f(x)$, $f''(x) > 0$, not by the pairs 
of its arguments and values (x, y), but by the pairs of its gradients at points (x, y) 
and the intercepts of its tangent lines with the y-axis. Given the non-zero Hessian (eq. 
2.3), one readily proves various results: e.g. that the geometric interpretation extends 
to higher dimensions, and that the transformation is self-inverse, i.e. its square is the 
identity. For details, cf. e.g.: Arnold (1989: Chapters 3.14, 9.45.C), Courant and 
Hilbert (1953: Chapter IV.9.3; 1962, Chapter 1.6), Jose and Saletan (1998: 212-217), 
Lanczos (1986: Chapter VI.1-4). The Legendre transformation is also described using 
modern geometry's idea of a fibre derivative; as we will see briefly in Section 6.7. 

(ii) : The transition to Hamilton's equations has achieved more than we initially 
sought with our eq. 4.1. Namely: all the $f_\alpha$, all the right hand sides in Hamilton's 
equations, are up to a sign, partial derivatives of a single function H. In the Hamilto- 
nian framework, it is precisely this feature that underpins the possibility of expressing 
the equations of motion by variational principles; (of course, the Lagrangian frame- 
work has a corresponding feature). But as I mentioned, this paper does not discuss 
variational principles; for details cf. e.g. Lanczos (1986: Chapter VI. 4) and Butterfield 
(2004: especially Section 5.2). 

To sum up this introduction to Hamilton's equations: — Even once we set aside 
(i) and (ii), these equations mark the beginning of a rich and multi-faceted theory. 
At the centre lies the 2n-dimensional phase space $\Gamma$ coordinatized by the qs and ps: 
or more precisely, as we shall see later, the cotangent bundle T*Q. The structure of 
Hamiltonian mechanics is encoded in the structure of $\Gamma$, and thereby in the coordinate 
transformations on $\Gamma$ that preserve this structure, especially the form of Hamilton's 
equations: the canonical transformations. As I mentioned in Section 4.1, these transformations 
can be studied from three main perspectives: generating functions, integral 
invariants and symplectic structure — but I shall only need the last. 

4.3 Symplectic forms on vector spaces 

I shall introduce symplectic structure by giving Hamilton's equations a yet more symmetric 
appearance. This will lead to some elementary ideas about area in $\mathbb{R}^m$ and 
symplectic forms on vector spaces: ideas which will later be "made local" by taking 
the relevant copy of $\mathbb{R}^m$ to be the tangent space at a point of a manifold. (As usually 
formulated, Hamiltonian mechanics is especially concerned with the case m = 2n.) 
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4.3.1 Time-evolution from the gradient of H 



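Writing $1$ and $0$ for the n x n identity and zero matrices respectively, we define the
2n x 2n symplectic matrix $\omega$ by

$$\omega := \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} . \tag{4.10}$$

$\omega$ is antisymmetric, and has the properties, writing $\tilde{\ }$ for the transpose of a matrix,
that

$$\tilde{\omega} = -\omega = \omega^{-1} \ \text{ so that } \ \omega^2 = -1 ; \ \text{ also } \ \det \omega = 1 . \tag{4.11}$$

Using $\omega$, Hamilton's equations eq. 4.7 get the more symmetric form, in matrix notation

$$\dot{\xi} = \omega \frac{\partial H}{\partial \xi} . \tag{4.12}$$

In terms of components, writing $\omega^{\alpha\beta}$ for the matrix elements of $\omega$, and defining
$\partial_\alpha := \partial / \partial \xi^\alpha$, eq. 4.7 become

$$\dot{\xi}^\alpha = \omega^{\alpha\beta} \partial_\beta H . \tag{4.13}$$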

Eq. 4.12 and 4.13 show how $\omega$ forms, from the naive gradient (column vector) $\nabla H$
of H on the phase space $\Gamma$ of qs and ps, the vector field on $\Gamma$ that gives the system's
evolution: the Hamiltonian vector field, often written $X_H$. At a point $z = (q,p) \in \Gamma$,
eq. 4.12 can be written

$$X_H(z) = \omega \nabla H(z) . \tag{4.14}$$

The vector field $X_H$ is also written as D (for 'dynamics'), on analogy with the
Lagrangian framework's vector field D of eq. 2.8 in Section 2.2.
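As a concrete illustration of eq. 4.14, the following minimal sketch computes the Hamiltonian vector field for a single degree of freedom. The choice of Hamiltonian, $H = (p^2 + q^2)/2$ (a unit-mass, unit-frequency oscillator), and the function names are assumptions made only for this example; they do not come from the text.

```python
# Sketch: X_H = omega . grad(H) for the illustrative choice H = (p^2 + q^2)/2
# on the (q, p) plane. The Hamiltonian and helper names are assumptions.

OMEGA = [[0.0, 1.0],
         [-1.0, 0.0]]  # symplectic matrix of eq. 4.10 with n = 1


def grad_H(q, p):
    """Naive gradient (dH/dq, dH/dp) of H = (p^2 + q^2)/2."""
    return [q, p]


def hamiltonian_vector_field(q, p):
    """Return [q_dot, p_dot] = omega . grad H, as in eq. 4.14."""
    g = grad_H(q, p)
    return [sum(OMEGA[a][b] * g[b] for b in range(2)) for a in range(2)]


q_dot, p_dot = hamiltonian_vector_field(0.3, -1.2)
# For this H, Hamilton's equations read q_dot = p and p_dot = -q.
```

So multiplying the gradient by $\omega$ simply swaps its two components and flips one sign, which is exactly the content of Hamilton's equations for n = 1.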

In Section 6 we will see how this definition of a vector field from a gradient, i.e. a
covector or 1-form field, arises from $\Gamma$'s being a cotangent bundle. More precisely, we
will see that any cotangent bundle has an intrinsic symplectic structure that provides,
at each point of the base-manifold, a natural i.e. basis-independent isomorphism
between the tangent space and the cotangent space. For the moment, we:

(i) note a geometric interpretation of $\omega$ in terms of area (Section 4.3.2); and then

(ii) generalize the above discussion of $\omega$ into the definition of a symplectic form for
a fixed vector space (Section 4.3.3).

4.3.2 Interpretation in terms of areas 

Let us begin with the simplest possible case: $\mathbb{R}^2 \ni (q,p)$, representing the phase space
of a particle constrained to one spatial dimension. Here, the 2 x 2 matrix

$$\omega := \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \tag{4.15}$$

defines the antisymmetric bilinear form on $\mathbb{R}^2$:

$$A : ((q^1,p_1),(q^2,p_2)) \in \mathbb{R}^2 \times \mathbb{R}^2 \mapsto q^1 p_2 - q^2 p_1 \in \mathbb{R} \tag{4.16}$$

since

$$\begin{pmatrix} q^1 & p_1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} q^2 \\ p_2 \end{pmatrix} = q^1 p_2 - q^2 p_1 . \tag{4.17}$$

It is easy to prove that $A((q^1,p_1),(q^2,p_2)) = q^1 p_2 - q^2 p_1$ is the signed area of the
parallelogram spanned by $(q^1,p_1)$, $(q^2,p_2)$, where the sign is positive (negative) if the
shortest rotation from $(q^1,p_1)$ to $(q^2,p_2)$ is anti-clockwise (clockwise).

Similarly in $\mathbb{R}^{2n}$: the matrix $\omega$ of eq. 4.10 defines an antisymmetric bilinear form
on $\mathbb{R}^{2n}$ whose value on a pair $(q,p) = (q^1,...,q^n; p_1,...,p_n)$, $(q',p') = (q'^1,...,q'^n; p'_1,...,p'_n)$
is the sum of the signed areas of the n parallelograms formed by the projections of the
vectors $(q,p)$, $(q',p')$ onto the n pairs of coordinate planes labelled 1, ..., n. That is to
say, the value is:

$$\sum_{i=1}^{n} \left( q^i p'_i - q'^i p_i \right) . \tag{4.18}$$
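The sum in eq. 4.18 is easy to evaluate directly. The sketch below is only an illustration: the function name and the convention of storing a vector as a flat list $(q^1,...,q^n,p_1,...,p_n)$ are assumptions of this example.

```python
# Sketch: the antisymmetric bilinear form of eq. 4.18 on R^{2n}.
# A vector is stored as a flat list (q^1, ..., q^n, p_1, ..., p_n).


def symplectic_area(u, v):
    """Sum over i of q^i p'_i - q'^i p_i, for u = (q, p) and v = (q', p')."""
    n = len(u) // 2
    return sum(u[i] * v[n + i] - v[i] * u[n + i] for i in range(n))


# n = 1: the unit square spanned by (1, 0) and (0, 1) has signed area +1.
unit_square = symplectic_area([1, 0], [0, 1])

# n = 2: q = (1, 2), p = (3, 4) against q' = (5, 6), p' = (7, 8).
u, v = [1, 2, 3, 4], [5, 6, 7, 8]
area_uv = symplectic_area(u, v)
area_vu = symplectic_area(v, u)
```

Swapping the two arguments flips the sign, and any vector paired with itself gives zero, as antisymmetry requires.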

This induction of bilinear forms from antisymmetric matrices can be generalized:
there is a one-to-one correspondence between forms and matrices. In more detail:
there is a one-to-one correspondence between antisymmetric bilinear forms on $\mathbb{R}^2$ and
antisymmetric 2 x 2 matrices. It is easy to check that any such form, $\omega$ say, is given, for
any basis v, w of $\mathbb{R}^2$, by the matrix

$$\begin{pmatrix} 0 & \omega(v,w) \\ -\omega(v,w) & 0 \end{pmatrix} .$$

Similarly for any integer n: one easily shows that there is a one-to-one correspondence
between antisymmetric bilinear forms on $\mathbb{R}^n$ and antisymmetric n x n matrices. (In
Hamiltonian mechanics as usually formulated, we consider the case where n is even and the
matrix is non-singular, as in eq. 4.10. But when one generalizes to Poisson manifolds
(cf. Section 6.8) one allows n to be odd, and the matrix to be singular.)

This geometric interpretation of $\omega$ is important for two reasons.

(i) : The first reason is that the idea of an antisymmetric bilinear form on a copy of
$\mathbb{R}^{2n}$ is the main part of the definition of a symplectic form, which is the central notion
in the usual geometric formulation of Hamiltonian mechanics. More details in Section
4.3.3 for a fixed copy of $\mathbb{R}^{2n}$; and in Section 6 where the form is defined on many
copies of $\mathbb{R}^{2n}$, each copy being the tangent space at a point in the cotangent bundle
T*Q.

(ii) : The second reason is that the idea of (signed) area underpins the theory of
forms (1-forms, 2-forms etc.): i.e. antisymmetric multilinear functions on products of
copies of $\mathbb{R}^n$. And when these copies of $\mathbb{R}^n$ are copies of the tangent space at (one and
the same) point in a manifold, these forms lead to the whole theory of integration on
manifolds. One needs this theory in order to make rigorous sense of any integration on
a manifold beyond the most elementary (i.e. line-integrals); so it is crucial for almost
any mathematical or physical theory using manifolds. In particular, it is crucial for
Hamiltonian mechanics. So no wonder the maestro says that 'Hamiltonian mechanics
cannot be understood without differential forms' (Arnold 1989, p. 163).

However, it turns out that this paper will not need many details about forms and the
theory of integration. This is essentially because we focus only on solving mechanical
problems, and simplifying them by appeals to symmetry. This means we will focus
on line-integrals: viz. integrating with respect to time the equations of motion; or
equivalently, integrating the dynamical vector field on the state space. We have already
seen this vector field as $X_H$ in eq. 4.14; and we will see it again, for example in terms
of Poisson brackets (eq. 5.14), and in geometric terms (Section 6). But throughout,
the main idea will be as suggested by eq. 4.14: the vector field is determined by the
symplectic matrix, "at" each point in the manifold $\Gamma$, acting on the gradient of the
Hamiltonian function H.

So in short: focussing on line-integrals enables us to side-step most of the theory
of forms. 16

4.3.3 Bilinear forms and associated linear maps 

We now generalize from the symplectic matrix $\omega$ to a symplectic form; in five extended
comments.

(1): Preliminaries: — 

Let V be a (real finite-dimensional) vector space, with basis $e_1, ..., e_i, ..., e_n$. We write
V* for the dual space, and $e^1, ..., e^i, ..., e^n$ for the dual basis: $e^i(e_j) := \delta^i_j$.

We recall that the isomorphism $e_i \mapsto e^i$ is basis-dependent: for a different basis,
the corresponding isomorphism would be a different map. Only with the provision of
appropriate extra structure would this isomorphism be basis-independent.

For physicists, the most familiar example of such a structure is the spacetime metric
g in relativity theory. In terms of components, this basis-independence shows up in
the way that g and its inverse lower and raise indices. As we will see in a moment, the
underlying mathematical point is that because g is a bilinear form on a vector space
V, i.e. $g : V \times V \to \mathbb{R}$, and is non-degenerate, any $v \in V$ defines, independently of
any choice of basis, an element of V*: viz. the map $u \in V \mapsto g(u,v)$. (In fact, V is
the tangent space at a spacetime point; but this physical interpretation is irrelevant
to the mathematical argument.) We will also see that Hamiltonian mechanics has
a non-degenerate bilinear form, viz. a symplectic form, that similarly gives a
basis-independent isomorphism between a vector space and its dual. (Roughly speaking, this
vector space will be the 2n-dimensional space of the qs and ps.)

On the other hand: for any vector space V, the isomorphism between V and V**
given by

$$e_i \mapsto [e_i] \in V^{**} : \ e^j \in V^* \mapsto e^j(e_i) = \delta^j_i \tag{4.19}$$

is basis-independent, and so we identify $e_i$ with $[e_i]$, and V with V**. We will write
$< \ ; \ >$ (also written $< \ , \ >$) for the natural pairing (in either order) of V and V*: e.g.
$< e_i ; e^j > = < e^j ; e_i > = \delta^j_i$.

16 But forms are essential for understanding integration over surfaces of dimension two or more:
which one needs for the integral invariants approach to Hamiltonian mechanics, and its deep connection
with Stokes' theorem.

A linear map $A : V \to W$ induces (basis-independently) a transpose (aka: dual),
written $\tilde{A}$ (or $A^T$ or $A^*$), $\tilde{A} : W^* \to V^*$ by

$$\forall \alpha \in W^*, \forall v \in V : \ \tilde{A}(\alpha)(v) = < \tilde{A}(\alpha) ; v > := \alpha(A(v)) = (\alpha \circ A)(v) . \tag{4.20}$$

If $A : V \to W$ is a linear map between real finite-dimensional vector spaces, its
matrix with respect to bases $e_1, ..., e_i, ..., e_n$ and $f_1, ..., f_j, ..., f_m$ of V and W is given by:

$$A(e_i) = A^j_i f_j ; \ \text{ i.e. with } v = v^i e_i , \ (A(v))^j = A^j_i v^i . \tag{4.21}$$

So the upper index labels rows, and the lower index labels columns. Similarly, if
$A : V \times W \to \mathbb{R}$ is a bilinear form, its matrix for these bases is defined as

$$A_{ij} := A(e_i, f_j) \tag{4.22}$$

so that on vectors $v = v^i e_i$, $w = w^j f_j$, we have: $A(v,w) = v^i A_{ij} w^j$.

(2) : Associated maps and forms: — 

Given a bilinear form $A : V \times W \to \mathbb{R}$, we define the associated linear map $A^\flat : V \to W^*$
by

$$A^\flat(v)(w) := A(v,w) . \tag{4.23}$$

Then $A^\flat(e_i) = A_{ij} f^j$: for both sides send any $w = w^j f_j$ to $A_{ij} w^j$. That is: the matrix
of $A^\flat$ in the bases $e_i$, $f^j$ of V and W* is given by $A_{ij}$:

$$[A^\flat]_{ji} = A_{ij} . \tag{4.24}$$

On the other hand, we can proceed from linear maps to associated bilinear forms.
Given a linear map $B : V \to W^*$, we define the associated bilinear form $B^\sharp$ on
$V \times W^{**} \cong V \times W$ by

$$B^\sharp(v,w) = < B(v) ; w > . \tag{4.25}$$

If we put $A^\flat$ for B in eq. 4.25, its associated bilinear form, acting on vectors
$v = v^i e_i$, $w = w^j f_j$, yields, by eq. 4.23:

$$(A^\flat)^\sharp(v,w) = < A^\flat(v) ; w > = A(v,w) . \tag{4.26}$$

One similarly shows that if $B : V \to W^*$, then $\forall w \in W$:

$$(B^\sharp)^\flat(v)(w) = < (B^\sharp)^\flat(v) ; w > = B^\sharp(v,w) = < B(v) ; w > \ \text{ so that } \ (B^\sharp)^\flat = B . \tag{4.27}$$

So the flat and sharp operations, $\flat$ and $\sharp$, are inverses.
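The inverse relation between flat and sharp (eqs. 4.26 and 4.27) can be mimicked with ordinary functions. This is only a sketch: representing a bilinear form as a two-argument Python function, and a linear map into the dual as a function returning a function, is an assumption of this example, as are all the names.

```python
# Sketch: flat and sharp as operations on Python functions.
# A bilinear form is a function of two vectors; a linear map V -> W* is a
# function that takes a vector and returns a function on W.


def flat(A):
    """Bilinear form A(v, w) -> linear map v |-> A(v, .), as in eq. 4.23."""
    return lambda v: (lambda w: A(v, w))


def sharp(B):
    """Linear map B : V -> W* -> bilinear form (v, w) |-> <B(v); w>, eq. 4.25."""
    return lambda v, w: B(v)(w)


def area_form(v, w):
    """Example bilinear form on R^2: the signed area of eq. 4.16."""
    return v[0] * w[1] - v[1] * w[0]


v, w = (1, 2), (3, 4)
round_trip = sharp(flat(area_form))   # (A^flat)^sharp, cf. eq. 4.26
direct = area_form(v, w)
```

Evaluating `round_trip(v, w)` reproduces `area_form(v, w)`, which is eq. 4.26 on this pair of vectors.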

(3) : Tensor products: — 

It will sometimes be helpful to put the above ideas in terms of tensor products. If
$v \in V$, $w \in W$, we can think of v and w as elements of V**, W** respectively. So
we define their tensor product as a bilinear form on $V^* \times W^*$ by requiring for all
$\alpha \in V^*$, $\beta \in W^*$:

$$(v \otimes w)(\alpha,\beta) := v(\alpha) w(\beta) = < v ; \alpha > < w ; \beta > . \tag{4.28}$$

Similarly for other choices of vector spaces or their duals. Given $\alpha \in V^*$, $\beta \in W^*$, their
tensor product is a bilinear form on $V \times W$:

$$(\alpha \otimes \beta)(v,w) := \alpha(v) \beta(w) = < v ; \alpha > < w ; \beta > . \tag{4.29}$$

Similarly, we can think of $\alpha \in V^*$, $w \in W$ as elements of V* and W** respectively, and
so define their tensor product as a bilinear form on $V \times W^*$:

$$(\alpha \otimes w)(v,\beta) := \alpha(v) w(\beta) = < v ; \alpha > < w ; \beta > . \tag{4.30}$$

In this way we can express the linear map $A : V \to W$ in terms of tensor products.
Since

$$A(e_i) = A^j_i f_j \ \text{ iff } \ < A(e_i) ; f^j > = A^j_i \tag{4.31}$$

eq. 4.30 implies that

$$A = A^j_i \, e^i \otimes f_j . \tag{4.32}$$

Similarly, a bilinear form $A : V \times W \to \mathbb{R}$ with matrix $A_{ij} := A(e_i, f_j)$ (cf. eq. 4.22)
is:

$$A = A_{ij} \, e^i \otimes f^j . \tag{4.33}$$

The definitions of tensor product eq. 4.28, 4.29 and 4.30 generalize to higher-rank
tensors (i.e. multilinear maps whose domains have more than two factors). But we
will not need these generalizations.

(4): Antisymmetric and non-degenerate forms:
We now specialize to the forms and maps of central interest in Hamiltonian mechanics.
We take W = V, dim(V) = n, and define a bilinear form $\omega : V \times V \to \mathbb{R}$ to be:

(i) : antisymmetric iff: $\omega(v,v') = -\omega(v',v)$;

(ii) : non-degenerate iff: if $\omega(v,v') = 0 \ \forall v' \in V$, then $v = 0$.

The form $\omega$ and its associated linear map $\omega^\flat : V \to V^*$ now have a square matrix $\omega_{ij}$
(cf. eq. 4.24). We define the rank of $\omega$ to be the rank of this matrix: equivalently, the
dimension of the range $\omega^\flat(V)$.

We will also need the antisymmetrized version of eq. 4.29 that is definable when
W = V. Namely, we define the wedge-product of $\alpha, \beta \in V^*$ to be the antisymmetric
bilinear form on V, given by

$$\alpha \wedge \beta : (v,w) \in V \times V \mapsto (\alpha(v))(\beta(w)) - (\alpha(w))(\beta(v)) \in \mathbb{R} . \tag{4.34}$$

(The connection with Section 4.3.2, especially eq. 4.18, will become clear in a moment;
and will be developed in Section 6.)

It is easy to show that for any bilinear form $\omega : V \times V \to \mathbb{R}$: $\omega$ is non-degenerate
iff the matrix $\omega_{ij}$ is non-singular iff $\omega^\flat : V \to V^*$ is an isomorphism.

So a non-degenerate bilinear form establishes a basis-independent isomorphism
between V and V*; cf. the discussion of the spacetime metric g in (1) at the start of this
Subsection.

Besides, this isomorphism $\omega^\flat$ has an inverse, suggesting another use of the sharp
notation, viz. $\omega^\sharp$ is defined to be $(\omega^\flat)^{-1} : V^* \to V$. The isomorphism $\omega^\sharp : V^* \to V$
corresponds to $\omega$'s role, emphasised in Section 4.3.1, of defining a vector field $X_H$ from
dH. (But we will see in a moment that the space V implicitly considered in Section
4.3.1 had more structure than being just any finite-dimensional real vector space: viz.
it was of the form $W \times W^*$.)

NB: This definition of $\sharp$ is of course not equivalent to our previous definition, in
eq. 4.25, since:

(i) : on our previous definition, $\sharp$ carried a linear map to a bilinear form, which
reversed the passage by $\flat$ from bilinear form to linear map, in the sense that for a
bilinear form $\omega$, we had $(\omega^\flat)^\sharp = \omega$, cf. eq. 4.26;

(ii) : on the present definition, $\sharp$ carries a bilinear form $\omega : V \times V \to \mathbb{R}$ to a linear
map $\omega^\sharp : V^* \to V$, which inverts $\flat$ in the sense (different from (i)) that

$$\omega^\sharp \circ \omega^\flat = \mathrm{id}_V \ \text{ and } \ \omega^\flat \circ \omega^\sharp = \mathrm{id}_{V^*} . \tag{4.35}$$

So beware: though not equivalent, both definitions are used! But it is a natural
ambiguity, in so far as the definitions "mesh". For example, one easily shows that our
second definition, i.e. eq. 4.35, is equivalent to a natural expression:

$$\forall \alpha, \beta \in V^* : \ < \omega^\sharp(\alpha), \beta > := \omega((\omega^\flat)^{-1}(\alpha), (\omega^\flat)^{-1}(\beta)) . \tag{4.36}$$



It is also straightforward to show that for any bilinear form $\omega : V \times V \to \mathbb{R}$: if $\omega$
is antisymmetric of rank $r \leq n = \dim(V)$, then r is even. That is: r = 2s for some
integer s, and there is a basis $e_1, ..., e_i, ..., e_n$ of V for which $\omega$ has a simple expansion
as wedge-products

$$\omega = \sum_{i=1}^{s} e^i \wedge e^{i+s} ; \tag{4.37}$$

equivalently, $\omega$ has the n x n matrix

$$\omega = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} . \tag{4.38}$$

where 1 is the s x s identity matrix, and similarly for the zero matrices of various sizes.
This normal form of antisymmetric bilinear forms is an analogue of the Gram-Schmidt
theorem that an inner product space has an orthonormal basis, and is proved by an
analogous argument.

(5): Symplectic forms: — 
As usually formulated, Hamiltonian mechanics uses a non-degenerate antisymmetric
bilinear form: i.e. r = n. So eq. 4.38 loses its bottom row and right column consisting
of zero matrices, and reduces to the form of Section 4.3.1's naive symplectic matrix,
eq. 4.10. Equivalently: eq. 4.37 reduces to eq. 4.18.

Accordingly, we define: a symplectic form on a (real finite-dimensional) vector space
Z is a non-degenerate antisymmetric bilinear form $\omega$ on Z: $\omega : Z \times Z \to \mathbb{R}$. Z is then
called a symplectic vector space. It follows that Z is of even dimension.

Besides, in Hamiltonian mechanics (as usually formulated) the vector space Z is a
product $V \times V^*$ of a vector space and its dual. Indeed, this was already suggested by:

(i) the fact in (3) of Section 4.2.2 that the canonical momenta $p_i := \frac{\partial L}{\partial \dot{q}^i}$ transform
as a 1-form, and

(ii) Section 4.3.1's discussion of the one-form field $\nabla H$ determining a vector field
$X_H$.

Thus we define the canonical symplectic form $\omega$ on $Z := V \times V^*$ by

$$\omega((v_1,\alpha_1),(v_2,\alpha_2)) := \alpha_2(v_1) - \alpha_1(v_2) . \tag{4.39}$$

So defined, $\omega$ is by construction a symplectic form, and so has the normal form given
by eq. 4.10.

Given a symplectic vector space $(Z,\omega)$, the natural question arises which linear
maps $A : Z \to Z$ preserve the normal form given by eq. 4.10. It is straightforward
to show that this is equivalent to A preserving the form of Hamilton's equations (for
any Hamiltonian); so that these maps A are called canonical (or symplectic, or Poisson).
But since (as I announced) this paper does not need details about the theory of
canonical transformations, I will not go into details about this. Suffice it to say here
the following.

$A : Z \to Z$ is symplectic iff, writing $\tilde{\ }$ for the transpose (eq. 4.20), and using the
second definition eq. 4.35 of $\sharp$, the following maps (both from Z* to Z) are equal:

$$A \circ \omega^\sharp \circ \tilde{A} = \omega^\sharp ; \tag{4.40}$$

or in matrix notation, with the matrix $\omega$ given by eq. 4.10, and again writing $\tilde{\ }$ for the
transpose of a matrix

$$A \omega \tilde{A} = \omega . \tag{4.41}$$

(Equivalent formulas are got by taking inverses. We get, respectively: $\tilde{A} \circ \omega^\flat \circ A = \omega^\flat$
and $\tilde{A} \omega A = \omega$.)

The set of all such linear symplectic maps $A : Z \to Z$ forms a group, the symplectic
group, written $Sp(Z, \omega)$.
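The matrix condition can be checked mechanically. The sketch below tests the equivalent form $\tilde{A} \omega A = \omega$ for n = 1, using integer matrices so that the comparison is exact. The helper names and the two sample matrices are assumptions chosen for illustration.

```python
# Sketch: test (transpose of A) . omega . A == omega for 2 x 2 matrices,
# the equivalent form of eq. 4.41. All names here are illustrative.

OMEGA = [[0, 1],
         [-1, 0]]  # eq. 4.10 with n = 1


def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(X):
    """Transpose of a matrix given as nested lists."""
    return [list(row) for row in zip(*X)]


def is_symplectic(A):
    """True iff A~ omega A equals omega (exact for integer entries)."""
    return matmul(transpose(A), matmul(OMEGA, A)) == OMEGA


shear = [[1, 2], [0, 1]]      # q -> q + 2p, p -> p: preserves signed area
dilation = [[2, 0], [0, 2]]   # q -> 2q, p -> 2p: multiplies areas by 4
```

For n = 1 the condition amounts to preserving signed area, so `is_symplectic(shear)` holds while `is_symplectic(dilation)` fails.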

To sum up this Subsection: we have, for a vector space V, dim(V) = n, and
$Z := V \times V^*$:

(i) : the canonical symplectic form $\omega : Z \times Z \to \mathbb{R}$; with normal form given by eq.
4.10;

(ii) : the associated linear map $\omega^\flat : Z \to Z^*$; which is an isomorphism, since $\omega$ is
non-degenerate;

(iii) : the associated linear map $\omega^\sharp : Z^* \to Z$; which is an isomorphism, since $\omega$ is
non-degenerate; and is the inverse of $\omega^\flat$ (cf. eq. 4.35).

We will see shortly that Hamiltonian mechanics takes V to be the tangent space $T_q$
at a point $q \in Q$, so that Z is $T_q \times T^*_q$, i.e. the tangent space to the space $\Gamma$ of the qs
and ps.



5 Poisson brackets and Noether's theorem 

We have seen how a single scalar function H on phase space $\Gamma$ determines the evolution
of the system via a combination of partial differentiation (the gradient of H) with the
symplectic matrix. We now express these ideas in terms of Poisson brackets.

For our purposes, Poisson brackets will have three main advantages; which will be 
discussed in the following order in the Subsections below. Poisson brackets: 

(i) give a neat expression for the rate of change of any dynamical variable; 

(ii) give a version of Noether's theorem which is more simple and powerful (and 
even easier to prove!) than the Lagrangian version; and 

(iii) lead to the generalized Hamiltonian framework mentioned in Section 6.8.

All three advantages arise from the way the Poisson bracket encodes the way that a 
scalar function determines a (certain kind of) vector field. 



5.1 Poisson brackets introduced 

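The rate of change of any dynamical variable f, taken as a scalar function on phase
space $\Gamma$, $f(q,p) \in \mathbb{R}$, is given (with summation convention) by

$$\frac{df}{dt} = \dot{q}^i \frac{\partial f}{\partial q^i} + \dot{p}_i \frac{\partial f}{\partial p_i} . \tag{5.1}$$

(If f is time-dependent, $f : (q,p,t) \in \Gamma \times \mathbb{R} \mapsto f(q,p,t) \in \mathbb{R}$, the right-hand-side
includes a term $\frac{\partial f}{\partial t}$. But on analogy with how our discussion of Lagrangian mechanics
imposed scleronomic constraints, a time-independent work-function etc., we here set
aside the time-dependent case.) Applying Hamilton's equations, this is

$$\frac{df}{dt} = \frac{\partial H}{\partial p_i} \frac{\partial f}{\partial q^i} - \frac{\partial H}{\partial q^i} \frac{\partial f}{\partial p_i} . \tag{5.2}$$

This suggests that we define the Poisson bracket of any two such functions f(q,p), g(q,p)
by

$$\{f,g\} := \frac{\partial f}{\partial q^i} \frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i} \frac{\partial g}{\partial q^i} ; \tag{5.3}$$

so that the rate of change of f is given by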

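$$\frac{df}{dt} = \{f,H\} . \tag{5.4}$$

In terms of the 2n coordinates $\xi^\alpha$ (eq. 4.6) and the matrix elements $\omega^{\alpha\beta}$ of $\omega$ (eq.
4.13), we can write eq. 5.2 as

$$\frac{df}{dt} = (\partial_\alpha f) \dot{\xi}^\alpha = (\partial_\alpha f) \omega^{\alpha\beta} (\partial_\beta H) ; \tag{5.5}$$

and so we can define the Poisson bracket by

$$\{f,g\} := (\partial_\alpha f) \omega^{\alpha\beta} (\partial_\beta g) \equiv \frac{\partial f}{\partial \xi^\alpha} \omega^{\alpha\beta} \frac{\partial g}{\partial \xi^\beta} . \tag{5.6}$$

In matrix notation: writing the naive gradients of f and of g as column vectors $\nabla f$
and $\nabla g$, and writing $\tilde{\ }$ for transpose, we have at any point $z = (q,p) \in \Gamma$:

$$\{f,g\}(z) = \widetilde{\nabla f}(z) \cdot \omega \cdot \nabla g(z) . \tag{5.7}$$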
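To make the definition concrete, the following sketch approximates the Poisson bracket of eq. 5.3 numerically, using central finite differences for the partial derivatives. The step size, the function names and the sample oscillator Hamiltonian are all assumptions of this example, and the results are approximate rather than exact.

```python
# Sketch: numerical Poisson bracket on phase space, with a point stored as a
# flat list z = (q^1, ..., q^n, p_1, ..., p_n). Names are illustrative.


def poisson_bracket(f, g, z, h=1e-5):
    """Approximate {f, g}(z) = sum_i (df/dq^i dg/dp_i - df/dp_i dg/dq^i)."""
    n = len(z) // 2

    def partial(func, k):
        # Central difference of func along the k-th coordinate at z.
        zp, zm = list(z), list(z)
        zp[k] += h
        zm[k] -= h
        return (func(zp) - func(zm)) / (2 * h)

    return sum(partial(f, i) * partial(g, n + i)
               - partial(f, n + i) * partial(g, i)
               for i in range(n))


def q(z):
    return z[0]


def p(z):
    return z[1]


def H(z):
    """Illustrative Hamiltonian H = (p^2 + q^2)/2 for n = 1."""
    return 0.5 * (z[0] ** 2 + z[1] ** 2)


z0 = [0.3, 0.7]
q_dot = poisson_bracket(q, H, z0)   # by eq. 5.4, approximately p = 0.7
p_dot = poisson_bracket(p, H, z0)   # approximately -q = -0.3
```

Within the finite-difference error, the two brackets reproduce Hamilton's equations at the chosen point, as eq. 5.4 says they should.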

With these definitions of the Poisson bracket, we readily infer the following five
results. (Later discussion will bring out the significance of some of these; in particular,
Section 6.8 will take some of them to jointly define a primitive Poisson bracket for a
generalized Hamiltonian mechanics.)

(1) : Since the Poisson bracket is antisymmetric, H itself is a constant of the motion:

$$\frac{dH}{dt} = \{H,H\} = 0 . \tag{5.8}$$

(2) : The Poisson bracket of a product is given by "Leibniz's rule": i.e. for any three
functions f, g, h, we have

$$\{f, h \cdot g\} = \{f,h\} \cdot g + h \cdot \{f,g\} . \tag{5.9}$$

(3) : Taking the Poisson bracket as itself a dynamical variable, its time-derivative
is given by a "Leibniz rule"; i.e. the Poisson bracket behaves like a product:

$$\frac{d}{dt} \{f,g\} = \left\{ \frac{df}{dt}, g \right\} + \left\{ f, \frac{dg}{dt} \right\} . \tag{5.10}$$

(4) : The Jacobi identity (easily deduced from (3)):

$$\{\{f,h\},g\} + \{\{g,f\},h\} + \{\{h,g\},f\} = 0 . \tag{5.11}$$

(5) : The Poisson brackets for the qs, ps and $\xi$s are:

$$\{\xi^\alpha, \xi^\beta\} = \omega^{\alpha\beta} ; \ \text{ i.e. } \tag{5.12}$$
$$\{q^i, p_j\} = \delta^i_j , \quad \{q^i, q^j\} = \{p_i, p_j\} = 0 . \tag{5.13}$$

Eq. 5.13 is very important, both for general theory and for problem-solving. The
reason is that preservation of these Poisson brackets, by a smooth transformation of
the 2n variables $(q,p) \to (Q(q,p), P(q,p))$, is necessary and sufficient for the
transformation being canonical. Besides, in this equivalence 'canonical' can be understood
both in the usual elementary sense of preserving the form of Hamilton's equations,
for any Hamiltonian function, and in the geometric sense of preserving the symplectic
form (explained in (5) of Section 4.3.3, and for manifolds in Section 6).
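For one degree of freedom, this criterion reduces to a single bracket, since $\{Q,Q\}$ and $\{P,P\}$ vanish by antisymmetry: the transformation is canonical iff $\{Q,P\} = 1$. The sketch below evaluates that bracket from the Jacobian of the transformation at a point. The function name and the two sample transformations are assumptions chosen for illustration.

```python
# Sketch: for n = 1, {Q, P} = dQ/dq dP/dp - dQ/dp dP/dq, so the fundamental
# brackets of eq. 5.13 are preserved iff this number equals 1.


def bracket_QP(jacobian):
    """{Q, P} from the Jacobian [[dQ/dq, dQ/dp], [dP/dq, dP/dp]] at a point."""
    (Qq, Qp), (Pq, Pp) = jacobian
    return Qq * Pp - Qp * Pq


# Exchange transformation Q = p, P = -q: mixes the qs and ps.
exchange = [[0, 1], [-1, 0]]

# Uniform scaling Q = 2q, P = 2p.
scaling = [[2, 0], [0, 2]]

exchange_bracket = bracket_QP(exchange)
scaling_bracket = bracket_QP(scaling)
```

The exchange transformation gives $\{Q,P\} = 1$ and so is canonical even though it is not a point transformation; the scaling gives $\{Q,P\} = 4$ and so is not.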

Note here that, as the phrase 'for any Hamiltonian function' brings out, the notion
of a canonical transformation is independent of the forces on the system as encoded in
the Hamiltonian. That is: the notion is a matter of $\Gamma$'s geometry, as we will emphasise
in Section 6.

But (as I announced in Section 4.1) I will not need to go into many details about
canonical transformations, essentially because this paper does not aim to survey the
whole of Hamiltonian mechanics, or even all that can be said about reducing problems,
e.g. by finding simplifying canonical transformations. It aims only to survey the way
that symmetries and conserved quantities effect such reductions. In the rest of this
Subsection, I begin describing Poisson brackets' role in this, in particular Noether's
theorem. But the description can only be completed once we have the geometric
perspective on Hamiltonian mechanics, i.e. in Section 6.5.



5.2 Hamiltonian vector fields 

Section 4.3.1 described how the symplectic matrix enabled the scalar function H on
$\Gamma$ to determine a vector field $X_H$. The previous Subsection showed how the Poisson
bracket expressed any dynamical variable's rate of change along $X_H$. We now bring
these ideas together, and generalize.

Recall that a vector X at a point x of a manifold M can be identified with a
directional derivative operator at x assigning to each smooth function f defined on a
neighbourhood of x its directional derivative along any curve that has X as its tangent
vector. Thus recall the Lagrangian definition of the dynamical vector field, eq. 2.8
in Section 2.2. Similarly here: the dynamical vector field $X_H =: D$ is a derivative
operator on scalar functions, which can be written in terms of the Poisson bracket:

$$D := X_H = \frac{d}{dt} = \dot{q}^i \frac{\partial}{\partial q^i} + \dot{p}_i \frac{\partial}{\partial p_i} = \frac{\partial H}{\partial p_i} \frac{\partial}{\partial q^i} - \frac{\partial H}{\partial q^i} \frac{\partial}{\partial p_i} = \{\cdot, H\} . \tag{5.14}$$

But this point applies to any smooth scalar, f say, on $\Gamma$. That is: although we
think of H as the energy that determines the real physical evolution, the mathematics
is of course the same for such an f. So any such function determines a vector field,
$X_f$ say, on $\Gamma$ that generates what the evolution "would be if f was the Hamiltonian".
Thinking of the integral curves as parametrized by s, we have

$$X_f = \frac{d}{ds} = \{\cdot, f\} . \tag{5.15}$$

$X_f$ is called the Hamiltonian vector field of (for) f; just as, for the physical Hamiltonian,
f = H, Section 4.3.1 called $X_H$ 'the Hamiltonian vector field'.



The notion of a Hamiltonian vector field will be crucial for what follows, not least 
for Noether's theorem in the very next Subsection. For the moment, we just make two 
remarks which we will need later. 

So every scalar f determines a Hamiltonian vector field $X_f$. But note that the
converse is false: not every vector field X on $\Gamma$ is the Hamiltonian vector field of
some scalar. For a vector field (equations of motion) X, with components $X^\alpha$ in the
coordinates $\xi^\alpha$ defined by eq. 4.6,

$$\dot{\xi}^\alpha = X^\alpha(\xi) , \tag{5.16}$$

there need be no scalar $H : \Gamma \to \mathbb{R}$ such that, as required by eq. 4.13,

$$X^\alpha = \omega^{\alpha\beta} \partial_\beta H . \tag{5.17}$$

This is the same point as in (ii) of Section 4.2.3: that Hamilton's equations have the
special feature that all the right hand sides are, up to a sign, partial derivatives of a
single function H, a feature that underpins the possibility of expressing the equations
of motion by variational principles.
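A standard illustration of a non-Hamiltonian vector field (in the coordinates q, p used here) is a linearly damped oscillator. The example, including the damping constant $\gamma$, is an assumption added for illustration and is not discussed in the text.

```latex
% Illustrative example: a damped oscillator on the (q, p) plane, with \gamma \neq 0.
\begin{align*}
\dot{q} &= p , & \dot{p} &= -q - \gamma p . \\
\intertext{If this field were $X_H$ for some scalar $H$, eq. 5.17 would require}
\frac{\partial H}{\partial p} &= p , & \frac{\partial H}{\partial q} &= q + \gamma p . \\
\intertext{But then the mixed second partial derivatives would disagree:}
\frac{\partial}{\partial q}\frac{\partial H}{\partial p} &= 0 , &
\frac{\partial}{\partial p}\frac{\partial H}{\partial q} &= \gamma \neq 0 .
\end{align*}
```

So no such H exists unless $\gamma = 0$, in which case $H = (p^2 + q^2)/2$ does the job.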

We also need to note under what condition a vector field X is Hamiltonian; (this
will bear on Noether's theorem). The answer is: X is locally Hamiltonian, i.e. there
is locally a scalar f such that $X = X_f$, iff X generates a one-parameter family of
canonical transformations. We will give a modern geometric proof of this in Section
6.5. For the moment, we only need to note, as at the end of Section 5.1, that here
'canonical transformation' can be understood in the usual elementary sense as a
transformation of $\Gamma$ that preserves the form of Hamilton's equations (for any Hamiltonian);
or equivalently, as preserving the Poisson bracket; or equivalently, as preserving the
symplectic form (to be defined for manifolds, in Section 6).



5.3 Noether's theorem 

5.3.1 An apparent "one-liner", and three claims 

In the Hamiltonian framework, the core of the proof of Noether's theorem is very
simple; as follows. The Poisson bracket is obviously antisymmetric. So for any scalar
functions f and H, we have

$$X_f(H) \equiv \frac{dH}{ds} = \{H,f\} = 0 \ \text{ iff } \ 0 = \{f,H\} = X_H(f) \equiv D(f) . \tag{5.18}$$

In words: H is constant under the flow of the vector field $X_f$ (i.e. under what the
evolution would be if f was the Hamiltonian) iff f is constant under the dynamical
flow $X_H = D$.
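A numerical sketch of eq. 5.18 for two degrees of freedom: take f to be the angular momentum $q^1 p_2 - q^2 p_1$, and compare a rotationally symmetric Hamiltonian with one that is not. The two Hamiltonians, the sample point and the finite-difference bracket are assumptions of this example, and the brackets are approximate.

```python
# Sketch: {f, H} vanishes when H is invariant under the flow of X_f.
# Phase-space points are flat lists z = (q1, q2, p1, p2).


def poisson_bracket(f, g, z, h=1e-5):
    """Approximate {f, g}(z) by central finite differences (eq. 5.3)."""
    n = len(z) // 2

    def partial(func, k):
        zp, zm = list(z), list(z)
        zp[k] += h
        zm[k] -= h
        return (func(zp) - func(zm)) / (2 * h)

    return sum(partial(f, i) * partial(g, n + i)
               - partial(f, n + i) * partial(g, i)
               for i in range(n))


def angular_momentum(z):
    q1, q2, p1, p2 = z
    return q1 * p2 - q2 * p1


def H_central(z):
    """Kinetic energy plus a potential depending only on q1^2 + q2^2."""
    q1, q2, p1, p2 = z
    return 0.5 * (p1 ** 2 + p2 ** 2) + (q1 ** 2 + q2 ** 2) ** 2


def H_tilted(z):
    """Kinetic energy plus the non-symmetric potential V = q1."""
    q1, q2, p1, p2 = z
    return 0.5 * (p1 ** 2 + p2 ** 2) + q1


z0 = [0.4, -0.9, 1.1, 0.2]
conserved = poisson_bracket(angular_momentum, H_central, z0)    # about 0
not_conserved = poisson_bracket(angular_momentum, H_tilted, z0)  # about q2
```

For the central potential the bracket vanishes up to discretisation error, so angular momentum is conserved; for the tilted potential one finds $\{f,H\} = q^2$, which is non-zero at the sample point.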

This "one-liner" is the Hamiltonian version of Noether's theorem! There are three
claims here. The first two relate back to the Lagrangian version of the theorem. The
third is about the definition of a (continuous) symmetry for a Hamiltonian system, and
so about how we should formulate the Hamiltonian version of Noether's theorem. I will
state all three claims, but in this Subsection justify only the first two. For it will be
convenient to postpone the third till after we have introduced some modern geometry
(Section 6.5).

First, for eq. 5.18 to deserve the name 'Noether's theorem', I need to show that it
encompasses Section 3's Lagrangian version of Noether's theorem (despite the trivial
proof!).

Second, in order to justify my claim that the Hamiltonian version of Noether's
theorem is more powerful than the Lagrangian version, I need to show that eq. 5.18
says more than that version, i.e. that it covers more symmetries.

To state the third claim, note first that we expect a Hamiltonian version of Noether's
theorem to say something like: to every continuous symmetry of a Hamiltonian system,
there corresponds a conserved quantity. Here, we expect a 'continuous symmetry' to
be defined by a vector field on $\Gamma$ (or by its flow). Indeed, a symmetry of a Hamiltonian
system is usually defined as a transformation of $\Gamma$ that:

(1) is canonical; (a condition independent of the forces on the system as encoded
in the Hamiltonian: a matter of $\Gamma$'s intrinsic geometry); and also

(2) preserves the Hamiltonian function; (a condition obviously dependent on the
Hamiltonian).

Accordingly, a continuous symmetry is defined as a vector field on $\Gamma$ that generates
a one-parameter family of such transformations; (or as such a field's flow, i.e. as the
family itself).

But with this definition of 'continuous symmetry' (of a Hamiltonian system), eq.
5.18 seems to suffer from two lacunae, if taken to express Noether's theorem, that
to every continuous symmetry there corresponds a conserved quantity. Agreed, the
rightward implication of eq. 5.18 provides, for a vector field $X_f$ with property (2), the
conserved quantity f. But there seem to be two lacunae:

(a) : eq. 5.18 is silent about whether $X_f$ has property (1), i.e. generates canonical
transformations.

(b) : eq. 5.18 considers only Hamiltonian vector fields, i.e. vector fields X induced
by some f, $X = X_f$. But as noted at the end of Section 5.2, there are countless vector
fields on $\Gamma$ that are not Hamiltonian. If such a field could be a continuous symmetry,
eq. 5.18's rightward implication would fall short of saying that to every continuous
symmetry, there corresponds a conserved quantity.

So the third claim I need is that these lacunae are illusory. In fact, a single result
will deal with both (a) and (b). Namely, it will suffice to show that a vector field X on
$\Gamma$ has property (1), i.e. generates canonical transformations, iff it is Hamiltonian, i.e.
induced by some f, $X = X_f$. But I postpone showing this till we have more modern
geometry in hand; cf. Section 6.5.

5.3.2 The relation to the Lagrangian version



On the other hand, we can establish the first two claims with the elementary apparatus 
so far developed. I will concentrate on justifying the first claim; that will also make 
the second claim clear. 

For the first claim, we need to show that:

(i) : to any variational symmetry of the Lagrangian L, i.e. a vector field X on Q
obeying eq. 3.6, there corresponds a vector field $X_f$ on $\Gamma$ for which $X_f(H) = 0$; and

(ii) : the correspondence in (i) is such that the scalar f can be taken to be (the
Hamiltonian version of) the momentum $p_X$ conjugate to X, defined by eq. 3.12 (or
geometrically, by 3.31).

It will be clearest to proceed in two stages. 

(A) : First, I will show (i) and (ii). 

(B) : Then I will discuss how (A) relates to the usual definition of a symmetry of a 
Hamiltonian system. 

(A) : The easiest way to show (i) and (ii) is to use the fact discussed after eq. 3.20,
that every variational symmetry X arises, around a point where it is non-zero, from a
cyclic coordinate in some local system of coordinates. (Recall that this follows from the
basic "rectification" theorem securing the local existence and uniqueness of solutions
of ordinary differential equations.) That is, there is some coordinate system (q) on
some open subset of X's domain of definition on Q such that

(a) : X being a variational symmetry is equivalent to $q^n$ being cyclic, i.e. $\frac{\partial L}{\partial q^n} = 0$;

(b) : the momentum $p_X$, which the Lagrangian theorem says is conserved, is the
elementary generalized momentum $p_n := \frac{\partial L}{\partial \dot{q}^n}$.

So suppose given a variational symmetry X, and a coordinate system (q) satisfying
(a)-(b). Now we recall that the Legendre transformation, i.e. the transition between
Lagrangian and Hamiltonian frameworks, does not "involve the dependence on the qs".
More precisely, we recall eq. 4.8: $\frac{\partial H}{\partial q^i} = -\frac{\partial L}{\partial q^i}$. Now consider $p_n : \Gamma \to \mathbb{R}$. This $p_n$ will
do as the function f required in (i) and (ii) above, since

$$X_{p_n}(H) \equiv \{H, p_n\} = \frac{\partial H}{\partial q^n} = -\frac{\partial L}{\partial q^n} = 0 . \tag{5.19}$$

Applying eq. 5.18 to eq. 5.19, we deduce that $p_n$, i.e. the $p_X$ of the Lagrangian
theorem, is conserved.

(Hence my remark after eq. 4.8, that the elementary result that $p_n$ is conserved iff
$q^n$ is cyclic, underpins the Hamiltonian version of Noether's theorem; just as the
corresponding Lagrangian result underpins the Lagrangian version of Noether's theorem:
cf. discussion after eq. 3.20.)
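The argument of (A) can be followed through in a familiar case. The example below, planar motion of a particle of mass m in a central potential V(r) written in polar coordinates, is an illustrative assumption and is not taken from the text; here the cyclic coordinate $q^n$ is the angle $\theta$.

```latex
% Illustrative example: central potential in polar coordinates (r, \theta),
% with conjugate momenta (p_r, p_\theta). The angle \theta is cyclic.
\begin{align*}
H &= \frac{p_r^2}{2m} + \frac{p_\theta^2}{2 m r^2} + V(r) ,
\qquad \frac{\partial H}{\partial \theta} = 0 ; \\
X_{p_\theta}(H) &\equiv \{H, p_\theta\} = \frac{\partial H}{\partial \theta} = 0 ; \\
\text{so by eq. 5.18:} \quad
\dot{p}_\theta &= \{p_\theta, H\} = -\frac{\partial H}{\partial \theta} = 0 .
\end{align*}
```

Here $p_\theta$ is the angular momentum, and the flow of $X_{p_\theta}$ is the one-parameter family of rotations $\theta \mapsto \theta + \epsilon$.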

(B) : I agree that this simple proof seems suspiciously simple. Besides, the suspicion
grows when you notice that my argument in (A) has not used a definition of a symmetry,
in particular a continuous symmetry, of a Hamiltonian system (contrast Section 3.2).
As discussed in Section 5.3.1, we expect a Hamiltonian version of Noether's theorem
to say 'to every continuous symmetry of a Hamiltonian system there corresponds a
conserved quantity'; where a continuous symmetry is a vector field that (1) generates
canonical transformations and (2) preserves the Hamiltonian. So the argument in (A)
is suspicious since, although eq. 5.19, or the left hand side of eq. 5.18, obviously
expresses property (2), i.e. preserving the Hamiltonian, the argument in (A) seems
nowhere to use property (1), i.e. the symmetry generating canonical transformations.

But in fact, all is well. The reason why lies in the fact mentioned in (i), (a) of
Section 4.1: that every point transformation (together with its lift to TQ) defines
a corresponding canonical transformation on $T^*Q$. That is to say: property (1) is
secured by the fact that the Lagrangian Noether's theorem of Section 3 is restricted to
symmetries induced by point transformations.

In other words, in terms of the vector field (variational symmetry) X given us by 
(a) in (A) above: one can check that X defines a vector field on $\Gamma$ (equivalently: a
one-parameter family of transformations on $\Gamma$) that is canonical, i.e. preserves Hamilton's
equations or equivalently the symplectic form. Indeed, one can easily check that, once
we rectify the Lagrangian variational symmetry X, so that it generates the rectified
one-parameter family of point transformations: $q^i = \text{const}$, $i \neq n$; $q^n \mapsto q^n + \epsilon$, the
vector field that X defines on $\Gamma$ is precisely the field $X_{p_n}$ chosen above.$^{17}$

Finally, the discussion in (B) also vindicates the second claim in Section 5.3.1: that
the Hamiltonian version of Noether's theorem, eq. 5.18, says more than the Lagrangian
version, i.e. covers more symmetries. This follows from the fact (announced in (i), (b)
of Section 4.1) that there are canonical transformations not induced by a point
transformation (together with its lift).

In elementary discussions, this is often expressed in terms of canonical
transformations being allowed to "mix" the qs and ps. But a more precise, and geometric,
statement is the result announced at the end of Section 5.2 (whose proof is postponed
to Section 6.5): that the condition for a vector field on $\Gamma$ to generate a one-parameter
family of canonical transformations is merely that it be a Hamiltonian vector field.
That is: for any scalar $f : \Gamma \to \mathbb{R}$, the vector field $X_f$ generates such a family.

In this sense, canonical transformations are two a penny (also known as: a dime a
dozen!). So it is little wonder that most discussions emphasise the other condition, i.e.
property (2): that $X_f$ preserve the Hamiltonian, $X_f(H) = 0$. Only very special fs will
satisfy $X_f(H) = 0$; and if we are given H (in certain coordinates q,p), it can be very
hard to find (the coordinate expression of) such an f.

Indeed, when Jacobi first propounded the theory of canonical transformations, in
his Lectures on Dynamics (1842), he was of course aware of this. Accordingly, he
pointed out that in theoretical mechanics, it was often more fruitful to first consider an
f (equivalently: a canonical transformation), and then cast about for a Hamiltonian
that it preserved. He wrote: 'The main difficulty in integrating a given differential
equation lies in introducing convenient variables, which there is no rule for finding.
Therefore we must travel the reverse path and after finding some notable substitution,
look for problems to which it can be successfully applied'; (quoted in Arnold (1989, p.
266)). The fact that Jacobi solved many previously intractable problems bears witness
to the power of this strategy, and of his theory of canonical transformations.

17 Details about point transformations on Q defining a canonical transformation on $T^*Q$, and lifting
the vector field X to $\Gamma$, can be found: (i) using traditional terms, in Goldstein et al. (2002: 375-376)
and Lanczos (1986: Chapter VII.2); (ii) using modern geometric terms (as developed in Section 6), in
Abraham and Marsden (1978: Sections 3.2.10-3.2.12) and Marsden and Ratiu (1999: Sections 6.3-6.4).

We can sum up this Subsection in two comments: — 

(1) In Hamiltonian mechanics, Noether's theorem is a biconditional, an 'iff'
statement. Not only does a Hamiltonian symmetry, i.e. a vector field X on $\Gamma$ that generates
canonical transformations (equivalently: preserves the symplectic form, or the Poisson
bracket) and preserves the Hamiltonian, X(H) = 0, provide a constant of the motion.
Also, given a constant of the motion $f : \Gamma \to \mathbb{R}$, there is a symmetry of the
Hamiltonian, viz. the vector field $X_f$. (Or if one prefers the integral notion of symmetry: the
flow of $X_f$.) This converse implication, from constant to symmetry, contrasts with the
Lagrangian framework; cf. the end of Section 3.4.1.

(2) In elementary Hamiltonian mechanics, Noether's theorem has a very simple
one-line proof, viz. eq. 5.18.
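
As a hedged numerical illustration of this one-line proof, the following Python sketch approximates Poisson brackets by central finite differences and checks the antisymmetry on which the biconditional rests. The planar Kepler Hamiltonian, the sample phase point, and the helper names (`poisson_bracket`, `kepler_H`, `angular_momentum`, `momentum_x`) are assumptions chosen for illustration; they do not come from the text.

```python
import math

def poisson_bracket(f, g, q, p, h=1e-5):
    """Approximate {f, g} = sum_i (df/dq^i dg/dp_i - df/dp_i dg/dq^i)
    at the phase point (q, p), using central finite differences."""
    def partial(func, slot, i):
        # Shift coordinate i of either q (slot 0) or p (slot 1) by +/- h.
        plus = [list(q), list(p)]
        minus = [list(q), list(p)]
        plus[slot][i] += h
        minus[slot][i] -= h
        return (func(plus[0], plus[1]) - func(minus[0], minus[1])) / (2 * h)

    return sum(
        partial(f, 0, i) * partial(g, 1, i) - partial(f, 1, i) * partial(g, 0, i)
        for i in range(len(q))
    )

def kepler_H(q, p):
    # Assumed example: unit mass in an attractive 1/r potential in the plane.
    return 0.5 * (p[0] ** 2 + p[1] ** 2) - 1.0 / math.hypot(q[0], q[1])

def angular_momentum(q, p):
    return q[0] * p[1] - q[1] * p[0]

def momentum_x(q, p):
    return p[0]

q0, p0 = [1.0, 0.5], [0.3, -0.2]
sym = poisson_bracket(kepler_H, angular_momentum, q0, p0)    # X_f(H) = {H, f}
const = poisson_bracket(angular_momentum, kepler_H, q0, p0)  # X_H(f) = {f, H}
not_sym = poisson_bracket(kepler_H, momentum_x, q0, p0)      # translations are not a symmetry here
print(sym, const, not_sym)
```

Up to discretization error, the first two numbers vanish together (rotations preserve H, and angular momentum is conserved), while the third does not: a translation along x is not a symmetry of this Hamiltonian, and the corresponding momentum is not conserved.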

Later, we will return to Noether's theorem. Section 6.5 will justify the third claim
of Section 5.3.1, by showing that a vector field generates a one-parameter family of
canonical transformations iff it is a Hamiltonian vector field. Meanwhile, we end Section
5 with a comment about "iterating" Noether's theorem, and the distinction between
such an iteration and the idea of complete integrability.

5.4 Glimpsing the "complete solution" 

Suppose we "iterate" Noether's theorem. That is: suppose there are several
(continuous) symmetries of the Hamiltonian and so several constants of the motion. Each will
confine the system's time-evolution to a $(2n-1)$-dimensional hypersurface of $\Gamma$. In
general, the intersection of k such surfaces will be a hypersurface of dimension $2n-k$
(i.e. of co-dimension k); to which the motion is therefore confined. The theory of
symplectic reduction (Butterfield 2006) describes how to do a "quotiented dynamics" in
this general situation. Here, I just remark on one aspect; which will not be developed
in the sequel.

Locally, the rectification theorem secures, for any system, not just several constants
of the motion, but "all you could ask for". Applying the theorem (eq. 3.21 and 3.22)
to the Hamiltonian vector field $X_H$ on $\Gamma$, we infer that locally there are coordinates $\xi^\alpha$
(maybe very hard to find!) in which $X_H$ has $2n-1$ components that vanish throughout
the neighbourhood, while the other component is 1:

$$X_H^\alpha = 0 \ \text{ for } \alpha = 1, 2, \ldots, 2n-1; \qquad X_H^{2n} = 1. \tag{5.20}$$

So the coordinates $\xi^\alpha$, $\alpha = 1, \ldots, 2n-1$, form $2n-1$ constants of the motion. They
are functionally independent, and all other constants of the motion are functions of
them; (cf. point (ii) after eq. 3.22). So the motion is confined to the one-dimensional
intersection of the $2n-1$ hypersurfaces, each of co-dimension 1. That is to say, it is
confined to the curve given by: $\xi^\alpha = \text{const}$, $\alpha = 1, \ldots, 2n-1$; $\xi^{2n} = t$.

To this, Noether's theorem eq. 5.18 adds the physical idea that each such constant
of the motion defines a vector field $X_{\xi^\alpha}$ that generates a symmetry of the Hamiltonian:

$$X_{\xi^\alpha}(H) = 0, \quad \text{for } \alpha = 1, 2, \ldots, 2n-1. \tag{5.21}$$

In this local sense, the "complete solution" of any Hamiltonian system lies in the local 
constants of the motion, or equivalently the local symmetries of its Hamiltonian H. 

To sum up: locally, any Hamiltonian system is "completely integrable". But
the scare-quotes here are a reminder that these phrases are usually used with other,
stronger, meanings: either that there are $2n-1$ global constants of the motion or that
the system is completely integrable in the sense of Liouville's theorem.

6 A geometrical perspective 

In this final Section, we develop the modern geometric description of Hamiltonian
mechanics. We will build especially on Section 4.3. One main aim will of course be to
complete the discussion of Noether's theorem, begun in Section 5.3.

There will be eight Subsections. First, we introduce the cotangent bundle T*Q. 
Then we collect what we will need about forms. Then we can show that any cotangent 
bundle is a symplectic manifold. This enables us to formulate Hamilton's equations 
geometrically; and to complete the discussion of Noether's theorem. Then we report 
Darboux's theorem, and its relation to reduction of problems. Then we return to 
the Lagrangian framework, by sketching the geometric formulation of the Legendre 
transformation. Finally, we "glimpse the landscape ahead" by mentioning the more 
general framework for Hamiltonian mechanics that uses Poisson manifolds. 

6.1 Canonical momenta are one-forms: $\Gamma$ as $T^*Q$

So far we have treated the phase space $\Gamma$ informally: saying just that it is a
2n-dimensional space coordinatized by the qs, a smooth coordinate system on the
configuration manifold Q, and the ps, which are canonical momenta $\partial L/\partial \dot{q}$. But we also saw
in (3) of Section 2.2.2 that at each point $q \in Q$, the $p_i$ transform as a 1-form (eq.
2.12). Accordingly we now take the physical state of the system to be a point in the
cotangent bundle $T^*Q$, the 2n-dimensional manifold whose points are pairs (q,p) with
$q \in Q$, $p \in T^*_q$.

I stress that from now on, the symbol p has a (fruitful!) ambiguity, between "dy- 
namics" and "kinematics/geometry". For p represents both: 

(A) the conjugate momentum $\partial L/\partial \dot{q}$, which of course depends on the choice of L; and

(B) a point in a fibre $T^*_q$ of the cotangent bundle $T^*Q$ (i.e. a 1-form or covector);
or relatedly: the components $p_i$ of such a 1-form: notions that are independent of any
choice of a Lagrangian or Hamiltonian.

In more detail:

(A) : Recall that in the Lagrangian framework, the basic equations (eq. 2.1 or
Newton's second law!) being second-order in time prompts us to take the initial q and
$\dot{q}$ as chosen independently, with L (encoding the forces on the system) then determining
the evolution (the Lagrangian dynamical vector field D), and so also determining the
actual "realized" value of $\dot{q}$ at other times as a function of q, and so ultimately, of t.
Similarly here: Newton's second law being second-order in time prompts us to take
the initial q and p as independent, with H (encoding the forces on the system) then
determining the evolution (the Hamiltonian dynamical vector field D), and so also
determining the actual value of p at other times as a function of q, and so ultimately,
of t. Besides, by passing via the Legendre transformation back to the Lagrangian
framework, one can check that the later actual value of p is determined to equal $\partial L/\partial \dot{q}$.

(B) : But p also represents any 1-form (so that $p_i$ represents the 1-form's
coordinates). Here, we need to recall three points:

(i) : A local coordinate system (a chart) on Q defines a basis in the tangent space
$T_q$ at any point q in the chart's domain. As usual, I write the chart's coordinate
functions as $q^i$. So I shall temporarily denote the chart by [q], so that there are coordinate
functions $q^i : \mathrm{dom}([q]) \to \mathbb{R}$. I write elements of the coordinate basis as usual, as $\partial/\partial q^i$.

(ii) : The chart [q] thereby also defines a dual basis $dq^i$ in the cotangent space $T^*_q$
at any $q \in \mathrm{dom}([q])$.

(Here I recall, en passant, that the isomorphism at each q between $T_q$ and $T^*_q$,
that maps the basis element $\partial/\partial q^i \in T_q$ to the one-form $dq^i$ in the dual basis, is
basis-dependent. A different basis would give a different isomorphism. Cf. the discussion
in (1) of Section HSU) 

(iii) : Putting (i) and (ii) together: the chart [q] thereby also induces a local
coordinate system on a neighbourhood of the cotangent bundle around any point $(q, p) \in T^*Q$
with $q \in \mathrm{dom}([q])$ and $p \in T^*_q$.

Putting (i)-(iii) together: the coordinates of any point (q,p) in $T^*Q$ in such a
coordinate system are usually also written as (q,p). That is: p is used for the components
of any 1-form, in the basis $dq^i$ dual to a coordinate basis $\partial/\partial q^i$. So, similarly to (i) above:
I will write this induced chart on $T^*Q$ as [q,p].

(C) : Taken together, points (A) and (B) prompt a question: 

Why should an evolution from an arbitrary initial state in $T^*Q$ have the
property that:

if we choose to express

(i) its configuration, q say, in terms of an arbitrary initial coordinate system
[q] on Q, and

(ii) its momenta $\partial L/\partial \dot{q}^i$ in terms of the basis $dq^i$ dual to the coordinate basis
$\partial/\partial q^i$ at $q_0$;

then

the states at a later time t have their momenta, which the Lagrangian
framework tells us must be $\partial L/\partial \dot{q}^i$ (cf. (A)), equal to their components in the
dual basis to the later coordinate basis, i.e. the coordinate basis $\partial/\partial q^i$ at the
later configuration $q_t$?

In short: why should the state's components in the dual basis of any
coordinate basis continue to be equal, as dynamical evolution goes on, to the
values of canonical momenta, i.e. $\partial L/\partial \dot{q}^i$?

A good question. The short answer lies in combining Hamilton's equations for the
time-derivative of the $p_i$ (eq. 4.5) with Lagrange's equations, and with the fact that
the partial derivatives with respect to $q^i$ of the Hamiltonian and Lagrangian, H and
L, are negatives of each other (eq. 4.8). Thus we have:

$$\dot{p}_i = -\frac{\partial H}{\partial q^i} = \frac{\partial L}{\partial q^i} = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}^i}\right). \tag{6.1}$$

From this it is clear that for any coordinate system, if at $t_0$, $p_i$ is chosen to equal
$\partial L/\partial \dot{q}^i$, then this will be so at later times. For eq. 6.1 forces their time-derivatives to be
equal, and so also, their later values must be equal.

So much for the short answer. We will also get more insight into the relations
between the Lagrangian and Hamiltonian frameworks in

(i) the fact, expounded in Section 6.3 below, that any cotangent bundle has a
natural symplectic structure, independent of the specification of any Lagrangian or
Hamiltonian function; and

(ii) some further details about the Legendre transformation, which is further
discussed in Section 6.7.

6.2 Forms, wedge-products and exterior derivatives

As I said at the end of Section 4.3.2, this paper can largely avoid the theory of forms.
For what follows (especially Section 6.5), I need to recall only:

(i) the idea of forms of various degrees, together comprising the exterior algebra,
and equipped with operations of wedge-product and contraction (Section 6.2.1);

(ii) the ideas of differential forms, the exterior derivative, and of exact and closed
forms (Section 6.2.2).

6.2.1 The exterior algebra; wedge-products and contractions

We begin by recalling some ideas of Sections 4.3.2 and 4.3.3. Let us again begin with
the simplest possible case, $\mathbb{R}^2$, considered as a vector space: not as a manifold with a
copy of itself as tangent space at each point.

If $\alpha, \beta$ are covectors, i.e. elements of $(\mathbb{R}^2)^*$, we define their wedge-product, an
antisymmetric bilinear form on $\mathbb{R}^2$, by

$$\alpha \wedge \beta : (v, w) \in \mathbb{R}^2 \times \mathbb{R}^2 \mapsto \alpha(v)\,\beta(w) - \alpha(w)\,\beta(v) \in \mathbb{R}. \tag{6.2}$$

Let us write the standard basis elements of $\mathbb{R}^2$ as $\partial/\partial q$ and $\partial/\partial p$, with elements of $\mathbb{R}^2$
having components (q,p) in this basis; and let us write the elements of the dual basis
as dq, dp. Recalling the definition of the area form A, eq. 4.16, we deduce that A is
$dq \wedge dp$.

Similarly for $\mathbb{R}^{2n}$. Recall that the symplectic matrix defines an antisymmetric
bilinear form on $\mathbb{R}^{2n}$ by eq. 4.18. The value on a pair $(q,p) = (q^1, \ldots, q^n; p_1, \ldots, p_n)$,
$(q',p') = (q'^1, \ldots, q'^n; p'_1, \ldots, p'_n)$ is the sum of the signed areas of the n parallelograms formed by
the projections of the vectors (q,p), (q',p') onto the n pairs of coordinate planes. This
is a sum of n wedge-products. That is to say: if we write the standard basis elements
as $\partial/\partial q^i$ and $\partial/\partial p_i$, this form is $\omega := \sum_i dq^i \wedge dp_i$. It has the action on $\mathbb{R}^{2n} \times \mathbb{R}^{2n}$:

$$\omega : ((q,p), (q',p')) \mapsto \sum_{i=1}^{n} \left( q^i p'_i - q'^i p_i \right) \in \mathbb{R}. \tag{6.3}$$
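
The definitions of eq. 6.2 and of the form $\omega = \sum_i dq^i \wedge dp_i$ can be tried out in a few lines of Python. This is only a sketch: covectors are modelled as plain functions on tuples of components, and the helper names (`covector`, `wedge`, `symplectic_form`) and the ordering of components as $(q^1, \ldots, q^n, p_1, \ldots, p_n)$ are assumptions made for illustration.

```python
def covector(components):
    """Return the linear functional v -> sum_i components[i] * v[i]."""
    return lambda v: sum(c * x for c, x in zip(components, v))

def wedge(alpha, beta):
    """Wedge-product of two covectors, following eq. 6.2."""
    return lambda v, w: alpha(v) * beta(w) - alpha(w) * beta(v)

# R^2 with components (q, p): dq and dp form the dual basis.
dq, dp = covector([1, 0]), covector([0, 1])
area = wedge(dq, dp)

def symplectic_form(n):
    """omega = sum_i dq^i ^ dp_i on R^{2n}; components ordered (q^1..q^n, p_1..p_n)."""
    terms = []
    for i in range(n):
        dqi = covector([1 if j == i else 0 for j in range(2 * n)])
        dpi = covector([1 if j == n + i else 0 for j in range(2 * n)])
        terms.append(wedge(dqi, dpi))
    return lambda v, w: sum(t(v, w) for t in terms)

omega = symplectic_form(2)
v = (1, 2, 3, 4)   # (q^1, q^2, p_1, p_2)
w = (5, 6, 7, 8)
print(area((2, 0), (0, 3)), omega(v, w), omega(w, v))
```

The first value is the signed area of the rectangle spanned by (2, 0) and (0, 3); the last two values differ only by a sign, which is the antisymmetry of the form.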

In general, if V, W are two (real finite-dimensional) vector spaces, we define: $L(V, W)$
to be the vector space of linear maps from V to W; $L^k(V,W)$ to be the vector space
of k-multilinear maps from $V \times V \times \cdots \times V$ (k copies) to W; and $L^k_a(V, W)$ to be the
subspace of $L^k(V,W)$ consisting of (wholly) antisymmetric maps.

We then define $\Omega^k(V) := L^k_a(V, \mathbb{R})$ for $k = 1, 2, \ldots, \dim(V)$, so that $\Omega^1(V) = V^*$.
We also set $\Omega^0(V) := \mathbb{R}$. $\Omega^k(V)$ is called the space of (exterior) k-forms on V. If
$\dim(V) = n$, then $\dim(\Omega^k(V)) = \binom{n}{k}$.



The wedge-product, as defined above, can be extended to be an operation that
defines, for $\alpha \in \Omega^k(V)$, $\beta \in \Omega^l(V)$, an element $\alpha \wedge \beta \in \Omega^{k+l}(V)$. We can skip the
details: suffice it to say that the idea is to take tensor products as in (3) of Section
4.3.3, and anti-symmetrize.

But to complete our discussion of Noether's theorem (in Section 6.5), we will need
the definition of the contraction (also known as: interior product) of a k-form
$\alpha \in \Omega^k(V)$ with a vector $v \in V$. We shall write this as $i_v\alpha$. (It is also written with a hook
notation.) We define the contraction $i_v\alpha$ to be the $(k-1)$-form given by:

$$i_v\alpha(v_2, \ldots, v_k) := \alpha(v, v_2, \ldots, v_k). \tag{6.4}$$

It follows, for example, that contraction distributes over the wedge-product modulo a
sign, in the following sense. If $\alpha$ is a k-form, and $\beta$ a 1-form, then

$$i_v(\alpha \wedge \beta) = (i_v\alpha) \wedge \beta + (-1)^k\, \alpha \wedge (i_v\beta). \tag{6.5}$$
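
To make eq. 6.5 concrete, here is the case k = 1, worked out directly from eq. 6.2 and eq. 6.4. It is a check of one instance, not a proof of the general statement. The sketch assumes the usual reading that the contraction of a 1-form with v is a number (a 0-form), and that the wedge-product with a 0-form is scalar multiplication.

```latex
\begin{align*}
i_v(\alpha \wedge \beta)(w)
  &= (\alpha \wedge \beta)(v, w)
   = \alpha(v)\,\beta(w) - \alpha(w)\,\beta(v), \\
\big[(i_v\alpha) \wedge \beta + (-1)^{1}\, \alpha \wedge (i_v\beta)\big](w)
  &= \alpha(v)\,\beta(w) - \beta(v)\,\alpha(w).
\end{align*}
% The two right-hand sides agree. For instance, on R^2:
% i_{\partial/\partial q}(dq \wedge dp) = dq(\partial/\partial q)\, dp - dp(\partial/\partial q)\, dq = dp .
```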

The direct sum of the vector spaces $\Omega^k(V)$, $k = 0, 1, 2, \ldots, \dim(V) =: n$, has
dimension $2^n$. When this direct sum is considered as equipped with the wedge-product $\wedge$
and contraction i, it is called the exterior algebra of V, written $\Omega(V)$.

6.2.2 Differential forms; the exterior derivative; the Poincaré Lemma



We extend the discussion given in Section 6.2.1 to a manifold M of dimension n, taking
all the tangent spaces $T_x$ at $x \in M$ as copies of the vector space V, and requiring fields
of forms to be suitably smooth.

We begin by saying that a (smooth) scalar function $f : M \to \mathbb{R}$ is a 0-form field. Its
differential or gradient, df, as defined by its action on all vector fields X, viz. mapping
them to f's directional derivative along X

$$df(X) := X(f) \tag{6.6}$$

is a 1-form (covector) field, called a differential 1-form.

The set $\mathcal{F}(M)$ of all smooth scalar functions forms an (infinite-dimensional) vector
space, indeed a ring, under pointwise operations. We write the set of vector fields on
M as $\mathcal{X}(M)$, or as $\mathcal{T}^1_0(M)$; and the set of covector fields, i.e. differential 1-forms, on
M as $\mathcal{X}^*(M)$, or as $\mathcal{T}^0_1(M)$. (So superscripts indicate the contravariant order, and
subscripts the covariant order.)

Accordingly, we define: $\Omega^0(M) := \mathcal{F}(M)$; $\Omega^1(M) = \mathcal{T}^0_1(M)$; and so on. In short:
$\Omega^k(M)$ is the set of smooth fields of exterior k-forms on the tangent spaces of M.

The wedge-product, as defined in Section 6.2.1, can be extended to the various
$\Omega^k(M)$. We form the direct sum of the (infinite-dimensional) vector spaces $\Omega^k(M)$,
$k = 0, 1, 2, \ldots, \dim(M) =: n$, and consider it as equipped with this extended wedge-product.
We call it the algebra of exterior differential forms on M, written $\Omega(M)$.

Similarly, contraction, as defined in Section 6.2.1, can be extended to $\Omega(M)$. On
analogy with eq. 6.4, we define, for $\alpha$ a k-form field on M, and X a vector field on M,
the contraction $i_X\alpha$ to be the $(k-1)$-form given, at each point $x \in M$, by:

$$i_X\alpha(x) : (v_2, \ldots, v_k) \mapsto \alpha(x)(X(x), v_2, \ldots, v_k) \in \mathbb{R}. \tag{6.7}$$

The exterior derivative is a differential operator on $\Omega(M)$ that maps a k-form field
to a (k + 1)-form field. In particular, it maps a scalar f to its differential (gradient)
df. Indeed, it is the unique map from the k-form fields to the (k + 1)-form fields
(k = 1, 2, ..., n) that generalizes the elementary notion of gradient $f \mapsto df$, subject to
certain natural conditions.

To be precise: one can show that there is a unique family of maps
$d^k : \Omega^k(M) \to \Omega^{k+1}(M)$, all of which, for simplicity, we write as d, such that:

(a) : If $f \in \mathcal{F}(M)$, $d(f) = df$.

(b) : d is $\mathbb{R}$-linear; and distributes across the wedge-product, modulo a sign. That
is: for $\alpha \in \Omega^k(M)$, $\beta \in \Omega^l(M)$, $d(\alpha \wedge \beta) = (d\alpha) \wedge \beta + (-1)^k \alpha \wedge (d\beta)$. (Cf. eq. 6.5.)

(c) : $d^2 := d \circ d = 0$; i.e. for all $\alpha \in \Omega^k(M)$, $d^{k+1} \circ d^k(\alpha) = 0$. (This condition looks
strong, but is in fact natural. For its motivation, it must here suffice to say that it
generalizes the fact in elementary vector calculus, that the curl of any gradient is zero:
$\nabla \wedge (\nabla f) \equiv 0$.)

(d) : d is a local operator, i.e. for any $x \in M$ and any k-form $\alpha$, $d\alpha(x)$ depends only
on $\alpha$'s restriction to any open neighbourhood of x; more precisely, we define for any
open set U of M, the vector space $\Omega^k(U)$ of k-form fields on U, and then require that

$$d(\alpha|_U) = (d\alpha)|_U. \tag{6.8}$$

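To express d in terms of coordinates: if $\alpha \in \Omega^k(M)$, i.e. $\alpha$ is a k-form on M, given
in coordinates by

$$\alpha = \alpha_{i_1 \ldots i_k}\, dx^{i_1} \wedge \cdots \wedge dx^{i_k} \quad (\text{sum on } i_1 < i_2 < \ldots < i_k), \tag{6.9}$$

then one proves that the exterior derivative is

$$d\alpha = \frac{\partial \alpha_{i_1 \ldots i_k}}{\partial x^j}\, dx^j \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_k} \quad (\text{sum on all } j \text{ and } i_1 < \ldots < i_k). \tag{6.10}$$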
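
As a worked instance of eq. 6.10, consider the case k = 1. The following lines are a sketch, using only the antisymmetry of the wedge-product and the symmetry of mixed partial derivatives; the particular example $\alpha = p\,dq$ on $\mathbb{R}^2$ is chosen for illustration.

```latex
\begin{align*}
\alpha = \alpha_i\, dx^i
  \;&\Longrightarrow\;
  d\alpha = \frac{\partial \alpha_i}{\partial x^j}\, dx^j \wedge dx^i
          = \sum_{j < i} \left( \frac{\partial \alpha_i}{\partial x^j}
                              - \frac{\partial \alpha_j}{\partial x^i} \right) dx^j \wedge dx^i , \\
\alpha = df,\ \ \alpha_i = \frac{\partial f}{\partial x^i}
  \;&\Longrightarrow\;
  d(df) = \sum_{j < i} \left( \frac{\partial^2 f}{\partial x^j \partial x^i}
                            - \frac{\partial^2 f}{\partial x^i \partial x^j} \right) dx^j \wedge dx^i = 0 , \\
\alpha = p\, dq \ \text{on } \mathbb{R}^2
  \;&\Longrightarrow\;
  d\alpha = \frac{\partial p}{\partial p}\, dp \wedge dq = dp \wedge dq .
\end{align*}
```

The second line is condition (c), $d^2 = 0$, in its simplest case; the third anticipates the canonical forms on a cotangent bundle.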

We define $\alpha \in \Omega^k(M)$ to be:

exact if there is a $\beta \in \Omega^{k-1}(M)$ such that $\alpha = d\beta$; (cf. the elementary definition of
an exact differential);
closed if $d\alpha = 0$.

It is immediate from condition (c) above, $d^2 = 0$, that every exact form is closed.
The converse is "locally true". This important result is the Poincaré Lemma; (and we
will use it in Section 6.5's closing discussion of Noether's theorem).

To be precise: for any open set U of M, we define (as in condition (d) above) the
vector space $\Omega^k(U)$ of k-form fields on U. Then the Poincaré Lemma states that if
$\alpha \in \Omega^k(M)$ is closed, then at every $x \in M$ there is a neighbourhood U such that
$\alpha|_U \in \Omega^k(U)$ is exact.

We will also need (again, for Section 6.5's discussion of Noether's theorem) a useful
formula relating the Lie derivative, contraction and the exterior derivative. Namely:
Cartan's magic formula, which says that if X is a vector field and $\alpha$ a k-form on a
manifold M, then the Lie derivative of $\alpha$ with respect to X (i.e. along the flow of X)
is

$$\mathcal{L}_X\alpha = d\, i_X\alpha + i_X\, d\alpha. \tag{6.11}$$

This is proved by straightforward calculation.

6.3 Symplectic manifolds; the cotangent bundle as a symplectic manifold

Any cotangent bundle $T^*Q$ has a natural symplectic structure, which is the geometric
structure on manifolds corresponding to the symplectic matrix $\omega$ introduced by eq.
4.10, and to the symplectic forms on vector spaces defined at the end of Section 4.3.3.
(Here 'natural' means intrinsic, and in particular, independent of a choice of coordinates
or bases.) It is this structure that enables a scalar function to determine a dynamics.
That is: the symplectic structure implies that any scalar function $H : T^*Q \to \mathbb{R}$
defines a vector field $X_H$ on $T^*Q$.

I first describe this structure (Section 6.3.1), and then show that any cotangent
bundle has it (Section 6.3.2). Later subsections will develop the consequences.

6.3.1 Symplectic manifolds 

A symplectic structure or symplectic form on a manifold M is defined to be a differential
2-form $\omega$ on M that is closed (i.e. $d\omega = 0$) and non-degenerate. That is: for any $x \in M$,
and any two tangent vectors at x, $\sigma, \tau \in T_x$:

$$d\omega = 0 \quad \text{and} \quad \forall \tau \neq 0,\ \exists \sigma : \ \omega(\tau, \sigma) \neq 0. \tag{6.12}$$

Such a pair $(M, \omega)$ is called a symplectic manifold.

There is a rich theory of symplectic manifolds; but we shall only need a small
fragment of it, building on our discussion in Section 4.3.3. (In particular, the fact that
we mostly avoid the theory of canonical transformations means we will not need the
theory of Lagrangian submanifolds.)

First, it follows from the non-degeneracy of $\omega$ that M is even-dimensional; (cf. eq.

EM- 

It also follows that at any $x \in M$, there is a basis-independent isomorphism $\omega^\flat$
from the tangent space $T_x$ to its dual $T^*_x$. We saw this in (2) and (4) of Section 4.3.3,
especially eq. 4.23. Namely: for any $x \in M$ and $\tau \in T_x$, the value of the 1-form
$\omega^\flat(\tau) \in T^*_x$ is defined by

$$\omega^\flat(\tau)(\sigma) := \omega(\sigma, \tau) \quad \forall \sigma \in T_x. \tag{6.13}$$

Here we return to the main idea emphasised already in Section 4.3.1: that symplectic
structure enables a covector field, i.e. a differential one-form, to determine a vector
field. Thus for any function $H : M \to \mathbb{R}$, so that dH is a differential 1-form on M, the
inverse of $\omega^\flat$ (which we might write as $\omega^\sharp$), carries dH to a vector field on M, written
X H . Cf. eq. WM 

So far, we have noted some implications of $\omega$ being non-degenerate. The other part
of the definition of a symplectic form (for a manifold), viz. $\omega$ being closed, $d\omega = 0$, is
also important. We shall see in Section 6.5 that it implies that a vector field X on a
symplectic manifold M preserves the symplectic form $\omega$ (i.e. in more physical jargon:
generates (a one-parameter family of) canonical transformations) iff X is Hamiltonian
in the sense of Section 5.2; i.e. there is a scalar function f such that $X = X_f = \omega^\sharp(df)$.
Or in terms of the Poisson bracket, with $\cdot$ representing the argument place for a scalar
function: $X(\cdot) = X_f(\cdot) = \{\cdot, f\}$.

So much by way of introducing symplectic manifolds. I turn to showing that any
cotangent bundle $T^*Q$ is such a manifold.

6.3.2 The cotangent bundle



Choose any local coordinates q on Q (dim(Q) = n), and the natural local coordinates
q,p thereby induced on $T^*Q$; (cf. (B) of Section 6.1). We define the 2-form

$$dp \wedge dq := dp_i \wedge dq^i := \sum_{i=1}^{n} dp_i \wedge dq^i. \tag{6.14}$$

To show that eq. 6.14 defines the same 2-form, whatever choice we make of the chart q
on Q, it suffices to show that $dp \wedge dq$ is the exterior derivative of a 1-form on $T^*Q$ which is
defined naturally (i.e. independently of coordinates or bases) from the derivative (also
known as: tangent) map of the projection

$$\pi : (q,p) \in T^*Q \mapsto q \in Q. \tag{6.15}$$

Thus consider a tangent vector $\tau$ (not to Q, but) to the cotangent bundle $T^*Q$ at a
point $\eta = (q,p) \in T^*Q$, i.e. $q \in Q$ and $p \in T^*_q$. Let us write this as:
$\tau \in T_\eta(T^*Q) = T_{(q,p)}(T^*Q)$. The derivative map, $D\pi$ say, of the natural projection $\pi$ applies to $\tau$:

$$D\pi : \tau \in T_{(q,p)}(T^*Q) \mapsto (D\pi(\tau)) \in T_q. \tag{6.16}$$

Now define a 1-form $\theta_H$ on $T^*Q$ by

$$\theta_H : \tau \in T_{(q,p)}(T^*Q) \mapsto p(D\pi(\tau)) \in \mathbb{R}; \tag{6.17}$$

where in this definition of $\theta_H$, p is defined to be the second component of $\tau$'s base-point
$(q,p) \in T^*Q$; i.e. $\tau \in T_{(q,p)}(T^*Q)$ and $p \in T^*_q$.

This 1-form is called the canonical 1-form on $T^*Q$. It is the "Hamiltonian version"
of the 1-form $\theta_L$ defined by eq. 2.13, and also there called the 'canonical 1-form'.
But Section 6.1's discussion of the "fruitful ambiguity" of the symbol p brings out
a contrast. While $\theta_L$ as defined by eq. 2.13 clearly depends on L, the definition of
$\theta_H$, eq. 6.17, does not depend on any function H. $\theta_H$ is given just by the cotangent
bundle structure. Hence the subscript H here just indicates "Hamiltonian (as against
Lagrangian) version", not dependence on a function H.

So much by way of a natural definition of a 1-form. One now checks that in any
natural local coordinates q, p, $\theta_H$ is given by

$$\theta_H = p_i\, dq^i. \tag{6.18}$$
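
The check of eq. 6.18, and the chart-independence it expresses, can be sketched as follows. The component names $a^i, b_i$ for a tangent vector, and the second chart $[Q]$ with induced fibre coordinates $P_j$, are notation introduced here for illustration; the transformation law used for the $P_j$ is the standard one for components of a 1-form.

```latex
\begin{align*}
\tau = a^i \frac{\partial}{\partial q^i} + b_i \frac{\partial}{\partial p_i}
  \;&\Longrightarrow\;
  D\pi(\tau) = a^i \frac{\partial}{\partial q^i}, \qquad
  \theta_H(\tau) = p\big(D\pi(\tau)\big) = p_i\, a^i = (p_i\, dq^i)(\tau) ; \\
Q^j = Q^j(q), \quad P_j = p_i \frac{\partial q^i}{\partial Q^j}
  \;&\Longrightarrow\;
  P_j\, dQ^j = p_i \frac{\partial q^i}{\partial Q^j} \frac{\partial Q^j}{\partial q^k}\, dq^k
             = p_i\, \delta^i_k\, dq^k = p_i\, dq^i .
\end{align*}
```

So the coordinate expression $p_i\,dq^i$ takes the same form in every induced chart, as the coordinate-free definition (eq. 6.17) requires.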

Finally, we define a 2-form by taking the exterior derivative of $\theta_H$:

$$d(\theta_H) := d(p_i\, dq^i) = dp_i \wedge dq^i, \tag{6.19}$$

where the last equation follows immediately from eq. 6.10. One checks that this 2-form
is closed (since $d^2 = 0$) and non-degenerate. So $(T^*Q, d(\theta_H))$ is a symplectic manifold.

Referring to eq. 4.18, or eq. 4.39 of Section 4.3.3, or eq. 6.3 of Section
6.2, we see that at each point $(q,p) \in T^*Q$, this symplectic form is, up to a sign, our
familiar "sum of signed areas", first seen as induced by the matrix $\omega$ of eq. 4.10.



Accordingly, Section 4.3.3's definition of a canonical symplectic form is extended
to the present case: $d(\theta_H)$, or its negative $-d(\theta_H)$, is called the canonical symplectic
form, or canonical 2-form. (The difference from Section 4.3.3's definition is that on a
manifold, the symplectic form is required to be closed.)

(The difference by a sign is of course conventional: it arises from our taking the qs,
not the ps, as the first n out of the 2n coordinates. For if we had instead taken the
ps, the matrix occurring in eq. 4.12 would have been $-\omega = \omega^{-1}$: exactly matching the
cotangent bundle's intrinsic 2-form $d(\theta_H)$.)

We will see, in Section 6.6, a theorem (Darboux's theorem) to the effect that locally,
any symplectic manifold "looks like" a cotangent bundle: or in other words, a cotangent
bundle is locally a "universal" example of symplectic structure. But first we return, in
the next two Subsections, to Hamilton's equations, and Noether's theorem.

6.4 Geometric formulations of Hamilton's equations 

We already emphasised in Sections 4.3 and 5 the main geometric idea behind Hamilton's
equations: that a gradient, i.e. covector, field dH determines a vector field $X_H$. We
first saw this determination via the symplectic matrix, in eq. 4.14 of Section 4.3.1, viz.

$$X_H(z) = \omega \nabla H(z); \tag{6.20}$$

and then via the Poisson bracket, in eq. 5.14 of Section 5.2, viz.

$$D = X_H = \frac{d}{dt} = \dot{q}^i \frac{\partial}{\partial q^i} + \dot{p}_i \frac{\partial}{\partial p_i} = \frac{\partial H}{\partial p_i}\frac{\partial}{\partial q^i} - \frac{\partial H}{\partial q^i}\frac{\partial}{\partial p_i}. \tag{6.21}$$

The symplectic structure and Poisson bracket were related by eq. 5.7, viz.

$$\{f, g\}(z) = \nabla f(z) \cdot \omega \cdot \nabla g(z). \tag{6.22}$$

And to this earlier discussion, the last Subsection, Section 6.3, added the identification
of the canonical symplectic form of a cotangent bundle, eq. 6.19.
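
The two matrix formulations just recalled, eq. 6.20 and eq. 6.22, can be checked numerically. The sketch below assumes that the symplectic matrix is the block matrix [[0, I], [-I, 0]] with the qs as the first n coordinates (which matches the component form in eq. 6.21); the harmonic-oscillator Hamiltonian and all function names are illustrative choices, not taken from the text.

```python
def grad(func, z, h=1e-6):
    """Central-difference gradient of func at z = (q^1..q^n, p_1..p_n)."""
    out = []
    for i in range(len(z)):
        zp, zm = list(z), list(z)
        zp[i] += h
        zm[i] -= h
        out.append((func(zp) - func(zm)) / (2 * h))
    return out

def symplectic_matrix(n):
    """Assumed form of the symplectic matrix: the 2n x 2n block matrix [[0, I], [-I, 0]]."""
    w = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        w[i][n + i] = 1.0
        w[n + i][i] = -1.0
    return w

def mat_vec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def hamiltonian_vector_field(H, z):
    """X_H(z) = omega . grad H(z), as in eq. 6.20."""
    return mat_vec(symplectic_matrix(len(z) // 2), grad(H, z))

def bracket(f, g, z):
    """{f, g}(z) = grad f . omega . grad g, as in eq. 6.22."""
    w_grad_g = mat_vec(symplectic_matrix(len(z) // 2), grad(g, z))
    return sum(a * b for a, b in zip(grad(f, z), w_grad_g))

def oscillator_H(z):
    # Illustrative Hamiltonian: unit-mass, unit-frequency harmonic oscillator.
    q, p = z
    return 0.5 * (p * p + q * q)

z0 = [0.6, -0.8]
x_h = hamiltonian_vector_field(oscillator_H, z0)            # should be (dH/dp, -dH/dq) = (p, -q)
qp_bracket = bracket(lambda z: z[0], lambda z: z[1], z0)    # fundamental bracket {q, p}
print(x_h, qp_bracket)
```

With these conventions the computed field agrees with the last expression in eq. 6.21, and the fundamental bracket {q, p} comes out as 1.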

Let us sum up these discussions by giving some geometric formulations of
Hamilton's equations at a point z = (q,p) in a cotangent bundle $T^*Q$. Let us write $\omega^\sharp$ for
the (basis-independent) isomorphism from the cotangent space to the tangent space,
$T^*_z \to T_z$, induced by $\omega := -d(\theta_H) = dq^i \wedge dp_i$ (cf. eq. 4.35 and 6.13). Then Hamilton's
equations, eq. 4.14 or 6.20, may be written as:

$$\dot{z} = X_H(z) = \omega^\sharp(dH(z)). \tag{6.23}$$

Applying $\omega^\flat$, the inverse isomorphism $T_z \to T^*_z$, to both sides, we get

$$\omega^\flat X_H(z) = dH(z). \tag{6.24}$$

In terms of the symplectic form $\omega$ at z, this is (cf. eq. 4.23): for all vectors $\tau \in T_z$

$$\omega(X_H(z), \tau) = dH(z) \cdot \tau; \tag{6.25}$$

or in terms of the contraction defined by eq. 6.4, with $\cdot$ marking the argument place
of $\tau \in T_z$:

$$i_{X_H}\omega := \omega(X_H(z), \cdot) = dH(z)(\cdot). \tag{6.26}$$

More briefly, and now for any function f, it is:

$$i_{X_f}\omega = df. \tag{6.27}$$

Here is a final example. Recall the relation between the Poisson bracket and the
directional derivative (or the Lie derivative $\mathcal{L}$) of a function, eq. 5.15 and 6.21, viz.

$$\mathcal{L}_{X_f}\, g = dg(X_f) = X_f(g) = \{g, f\}. \tag{6.28}$$

Combining this with eq. 6.27, we can reformulate the relation between the symplectic
form and Poisson bracket, eq. 6.22, in the form:

$$\{g, f\} = dg(X_f) = i_{X_f}\, dg = i_{X_f}(i_{X_g}\omega) = \omega(X_g, X_f). \tag{6.29}$$

6.5 Noether's theorem completed

The discussion of Noether's theorem in Section 5.3 left unfinished business: to prove
that a vector field generates a one-parameter family of canonical transformations iff it
is a Hamiltonian vector field (and so justify the third claim of Section 5.3.1). Cartan's
magic formula and the Poincaré Lemma, both from Section 6.2, make it easy to prove
this, for a vector field on any symplectic manifold $(M, \omega)$. ($(M, \omega)$ need not be a
cotangent bundle.)

We define a vector field X on a symplectic manifold $(M, \omega)$ to be symplectic (also
known as: canonical) iff the Lie-derivative along X of the symplectic form vanishes,
i.e. $\mathcal{L}_X\omega = 0$.$^{18}$

Since $\omega$ is closed, i.e. $d\omega = 0$, Cartan's magic formula, eq. 6.11, applied to $\omega$
becomes

$$\mathcal{L}_X\omega = d\, i_X\omega + i_X\, d\omega = d\, i_X\omega. \tag{6.30}$$

So for X to be symplectic is for $i_X\omega$ to be closed. But by the Poincaré Lemma, if $i_X\omega$
is closed, it is locally exact. That is: there locally exists a scalar function $f : M \to \mathbb{R}$
such that

$$i_X\omega = df \quad \text{i.e.} \quad X = X_f. \tag{6.31}$$

So for X to be symplectic is equivalent to X being locally Hamiltonian.

18 As announced in Section 2.2.1, I assume the notion of the Lie-derivative, in particular the
Lie-derivative of a 2-form. Suffice it to say, as a sketch, that the flow of X defines a map on M which
induces a map on curves, and so on vectors, and so on co-vectors, and so on 2-forms such as $\omega$.
Nor will I go into details about the equivalence between this definition of X's being symplectic, and
X's generating (active) canonical transformations, or preserving the Poisson bracket. For as I have
emphasised, I will not need to develop the theory of canonical transformations.



So we can sum up Noether's theorem from a geometric perspective, as follows.
We define a Hamiltonian system to be a triple $(M, \omega, H)$ where $(M, \omega)$ is a symplectic
manifold and $H : M \to \mathbb{R}$, i.e. $H \in \mathcal{F}(M)$. We define a (continuous) symmetry of a
Hamiltonian system to be a vector field $X$ on $M$ that preserves both the symplectic
form, $\mathcal{L}_X\omega = 0$, and the Hamiltonian function, $\mathcal{L}_X H = 0$. As we have just seen: for
any symmetry so defined, there locally exists an $f$ such that $X = X_f$. So we can apply
the "one-liner", eq. 5.18, i.e. the antisymmetry of the Poisson bracket,

$$ X_f(H) = \{H, f\} = 0 \quad \text{iff} \quad X_H(f) = \{f, H\} = 0 , \qquad (6.32) $$

to conclude that $f$ is a first integral (constant of the motion). Thus we have

Noether's theorem for a Hamiltonian system: If $X$ is a symmetry of a
Hamiltonian system $(M, \omega, H)$, then locally $X = X_f$ and $f$ is a constant
of the motion. And conversely: if $f : M \to \mathbb{R}$ is a constant of the motion,
then $X_f$ is a symmetry. Besides, this result encompasses the Lagrangian
version of the theorem; cf. Sections 3.4 and 5.3.

Example: For most Hamiltonian systems in euclidean space $\mathbb{R}^3$, spatial translations
and rotations are (continuous) symmetries. For example, consider $N$ point-particles
interacting by Newtonian gravity. The Hamiltonian is a sum of two terms,
which are each individually invariant under these euclidean motions:

(i) a kinetic energy term $K$; though I will not go into details, it is in fact defined
by the euclidean metric of $\mathbb{R}^3$ (cf. footnote 4 in Section 2.1), and is thereby invariant;
and

(ii) a potential energy term $V$; it depends only on the particles' relative distances,
and is thereby invariant.

The corresponding conserved quantities are the total linear and angular momentum. 19
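
The example can be checked numerically in a small case. The following is a minimal sketch, not part of the original argument: it assumes two particles moving in a plane (rather than $N$ particles in $\mathbb{R}^3$), illustrative masses and coupling constant, and a finite-difference estimate of the Poisson bracket $\{g, f\} = \sum_k (\partial g/\partial q^k \, \partial f/\partial p_k - \partial g/\partial p_k \, \partial f/\partial q^k)$. All function names are hypothetical.

```python
# Sketch (not from the paper): two point-particles in the plane with a
# gravitational-type potential depending only on their separation.
# Phase-space point z = [x1, y1, x2, y2, px1, py1, px2, py2].
import math

M1, M2, G = 1.0, 2.0, 1.0  # illustrative masses and coupling (assumptions)

def hamiltonian(z):
    x1, y1, x2, y2, px1, py1, px2, py2 = z
    kinetic = (px1**2 + py1**2) / (2 * M1) + (px2**2 + py2**2) / (2 * M2)
    potential = -G * M1 * M2 / math.hypot(x1 - x2, y1 - y2)
    return kinetic + potential

def total_momentum_x(z):
    return z[4] + z[6]

def total_angular_momentum(z):
    x1, y1, x2, y2, px1, py1, px2, py2 = z
    return (x1 * py1 - y1 * px1) + (x2 * py2 - y2 * px2)

def position_x1(z):
    return z[0]  # not conserved; included only as a contrast

def partial(func, z, k, h=1e-5):
    """Central-difference estimate of d(func)/d(z[k])."""
    up, down = list(z), list(z)
    up[k] += h
    down[k] -= h
    return (func(up) - func(down)) / (2 * h)

def poisson_bracket(g, f, z):
    """{g, f} = sum_k (dg/dq_k * df/dp_k - dg/dp_k * df/dq_k)."""
    n = len(z) // 2
    return sum(
        partial(g, z, k) * partial(f, z, n + k)
        - partial(g, z, n + k) * partial(f, z, k)
        for k in range(n)
    )

z0 = [0.3, -0.2, 1.1, 0.7, 0.4, 0.1, -0.5, 0.9]  # arbitrary phase-space point
for name, f in [("P_x", total_momentum_x),
                ("L_z", total_angular_momentum),
                ("x_1", position_x1)]:
    print(name, poisson_bracket(f, hamiltonian, z0))
```

Up to finite-difference error, the brackets of the total momentum component and the total angular momentum with $H$ vanish, as eq. 6.32 requires of constants of the motion, while the bracket of $x_1$ with $H$ does not.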

Finally, an incidental remark which relates to the "rectification theorem", that on
any manifold any vector field $X$ can be "straightened out" in a neighbourhood around
any point at which $X$ is non-zero, so as to have all but one component vanish and
the last component equal to 1; cf. eq. 3.22. Using this theorem, it is easy to see
that on any even-dimensional manifold any vector field $X$ is locally Hamiltonian, with
respect to some symplectic form, around a point where $X$ is non-zero. (One defines
the symplectic form by Lie-dragging from a surface transverse to $X$'s integral curves.)

6.6 Darboux's theorem, and its role in reduction 

Darboux's theorem states that cotangent bundles are, locally, a "universal form" of
symplectic manifold. That is: Not only is any symplectic manifold $(M, \omega)$ even-dimensional.
Also, it "looks locally like" a cotangent bundle, in that around any $x$
in $M$, there is a local coordinate system $(q^1, \ldots, q^n; p_1, \ldots, p_n)$ (where the use of both
upper and lower indices is now just conventional, with no meaning about dual bases!)
in which:

(i) $\omega$ takes the form $dq^i \wedge dp_i$; and so

(ii) the Poisson brackets of the $q$s and $p$s take the fundamental form in eq. 5.13.
(The theorem generalizes to the Poisson manifolds of Section 6.8.)

19 By the way, this Hamiltonian is not invariant under boosts. But as I said earlier (cf.
footnote 8), I restrict myself to time-independent transformations; the treatment of symmetries that
"represent the relativity of motion" needs separate discussion.

Besides, the proof of Darboux's theorem yields further information: information
which is important for reducing problems. It arises from the beginning of the proof;
and will return us to the earlier point that the elementary connection between cyclic
coordinates and conserved conjugate momenta underpins the role of symmetries and
conserved quantities in reductions on symplectic manifolds.

(In fact, Darboux's theorem also yields two other broad implications about reducing 
problems; but I will not develop the details here. The second implication concerns the 
way that a Hamiltonian structure is preserved in the reduced problem. The third 
implication concerns the requirement that constants of the motion be in involution, 
i.e. have vanishing Poisson bracket with each other; so it leads to the idea of complete 
integrability — a topic this paper foreswears.) 

Namely, the proof implies that "almost" any scalar function $f \in \mathcal{F}(M)$ can be
taken as the first "momentum" coordinate $p_1$; or as the first configurational coordinate
$q^1$. Here "almost" is not meant in a measure-theoretic sense; it is just that $f$ is subject
to a mild restriction, that $df \neq 0$ at the point $x \in M$.

In a bit more detail: The proof of Darboux's theorem starts by taking any such
$f$ to be our $p_1$, and then constructs the canonically conjugate generalized coordinate
$q^1$, i.e. the coordinate such that $\{q^1, p_1\} = 1$: so that $p_1$ generates translation in
the direction of increasing $q^1$. Indeed the construction is geometrically clear. The
symplectic structure means that any such $f$ defines a Hamiltonian vector field $X_f$, and
a flow $\phi_f$. We choose a $(2n-1)$-dimensional local submanifold $N$ passing through the
given point $x$, and transverse to all the integral curves of $X_f$ in a neighbourhood of $x$;
and we set the parameter $\lambda$ of the flow $\phi_f$ to be zero at all points $y \in N$. Then for
any $z$ in a suitably small neighbourhood of the given point $x$, we define the function
$q^1(z)$ to be the parameter-value at $z$ of the integral curve of $X_f$ that passes through
$z$. So by construction, (i) $f$ generates translation in the direction of increasing $q^1$, and
(ii) defining $p_1 := f$, we have $\{q^1, p_1\} = 1$.

This is just the beginning of the proof. But I will not need details of how it goes
on to establish the local existence of canonical coordinates, i.e. coordinates such that
analogues of (i) and (ii), also for $i \neq 1$, hold. In short, the strategy is to use induction
on the dimension of the manifold; for details, cf. e.g. Arnold (1989: 230-232).
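
To make the first step of the construction concrete, here is a hedged worked example on the punctured plane with $\omega = dq \wedge dp$. The choice $f = \frac{1}{2}(q^2 + p^2)$ and the polar coordinates $(r, \varphi)$ are illustrative assumptions, not taken from the text. Away from the origin $df \neq 0$, and the transversal $N$ can be taken to be the ray $\varphi = 0$.

```latex
% Assumptions: M = \mathbb{R}^2 \setminus \{0\}, \omega = dq \wedge dp,
% q = r\cos\varphi, p = r\sin\varphi, and f = (q^2 + p^2)/2 = r^2/2.
\begin{align*}
X_f &= \frac{\partial f}{\partial p}\frac{\partial}{\partial q}
     - \frac{\partial f}{\partial q}\frac{\partial}{\partial p}
     = p\,\frac{\partial}{\partial q} - q\,\frac{\partial}{\partial p}
     = -\frac{\partial}{\partial \varphi}, \\
q^1 &:= -\varphi \quad\text{(flow parameter of $X_f$, zero on the ray $\varphi = 0$)},
  \qquad p_1 := f = \tfrac{1}{2}r^2, \\
\{q^1, p_1\} &= X_f(q^1) = -\frac{\partial(-\varphi)}{\partial\varphi} = 1, \\
dq^1 \wedge dp_1 &= -\,d\varphi \wedge r\,dr = r\,dr \wedge d\varphi = dq \wedge dp = \omega .
\end{align*}
```

With only one degree of freedom the induction has nothing further to do: on any sector where $\varphi$ is single-valued, $(q^1, p_1)$ is already a canonical chart.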

To see the significance of this for reducing problems, suppose that there is a constant
of the motion, and that we take it as our $f$, i.e. as the first momentum coordinate
$p_1$. So the system evolves on a $(2n-1)$-dimensional manifold given by an equation
$f = $ constant. So writing $H$ in the canonical coordinate system secured by Darboux's
theorem, we conclude that $0 = \dot{f} = -\frac{\partial H}{\partial q^1}$. That is, $q^1$ is cyclic. So, as discussed
earlier, we need only solve the problem in the $2n-2$ variables $q^2, \ldots, q^n; p_2, \ldots, p_n$.
Having done so, we can find $q^1$ as a function of time, by solving eq. 4.9 by quadrature.
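
A familiar instance of this reduction, given here as an illustrative sketch rather than as part of the text's argument, is a particle of mass $m$ moving in a plane under a central potential $V(r)$. The polar coordinates $(r, \varphi)$, the symbol $\alpha$ for the conserved value of the angular momentum, and the name $H_\alpha$ for the reduced Hamiltonian are notational assumptions.

```latex
% Assumptions: planar motion, polar coordinates (r, \varphi), central potential V(r).
\begin{align*}
H &= \frac{p_r^2}{2m} + \frac{p_\varphi^2}{2mr^2} + V(r), \qquad
\dot{p}_\varphi = -\frac{\partial H}{\partial \varphi} = 0
  \ \Rightarrow\ p_\varphi = \alpha = \text{constant}, \\
H_\alpha(r, p_r) &= \frac{p_r^2}{2m} + \frac{\alpha^2}{2mr^2} + V(r)
  \quad\text{(reduced problem in the $2n - 2 = 2$ variables $r, p_r$)}, \\
\varphi(t) &= \varphi(0) + \int_0^t \frac{\alpha}{m\,r(t')^2}\,dt'
  \quad\text{(quadrature, from $\dot\varphi = \partial H/\partial p_\varphi$)}.
\end{align*}
```

Here $\varphi$ plays the role of the cyclic coordinate $q^1$ and $p_\varphi$ that of $p_1$.
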

To put the point in geometric terms:

(i): The system is confined to a $(2n-1)$-dimensional manifold $p_1 = \alpha = $ constant,
$M_\alpha$ say.

(ii): $M_\alpha$ is foliated by a local one-parameter family of $(2n-2)$-dimensional manifolds
labelled by values of $q^1 \in I \subset \mathbb{R}$, $M_\alpha = \bigcup_{q^1 \in I} M_{\alpha, q^1}$.

(iii): Of course, the dynamical vector field is transverse to the leaves of this foliation;
i.e. $q^1$ is not a constant of the motion, $\dot{q}^1 \neq 0$. But since $q^1$ is ignorable, $\frac{\partial H}{\partial q^1} = 0$, the
problem to be solved is "the same" at points $x_1, x_2$ that differ only in their values of
$q^1$.
6.7 Geometric formulation of the Legendre transformation 

Let us round off our development of both Lagrangian and Hamiltonian mechanics, by 
formulating the Legendre transformation as a map from the tangent bundle TQ to the 
cotangent bundle T*Q. In this formulation, the Legendre transformation is often called 
the fibre derivative. 

Again, there is a rich theory to be had here. In part, it relates to the topics
mentioned in Section 4.2.3: (i) the description of a function (in the simplest case
$f : \mathbb{R} \to \mathbb{R}$) by its gradients and axis-intercepts, rather than by its arguments and
values; (ii) variational principles. But I shall not go into details about this theory:
since this paper emphasises the Hamiltonian framework, a mere glimpse of this theory
must suffice. (References, additional to those in Section 4.2.3, include: Abraham and
Marsden (1978: Sections 3.6-3.8) and Marsden and Ratiu (1999: Sections 7.2-7.5,
8.1-8.3).)

Let us return to the Lagrangian framework. We stressed earlier that a scalar
on the tangent bundle, the Lagrangian $L : TQ \to \mathbb{R}$, "determines everything": the
dynamical vector field $D =: D_L$; and so for given initial $q$ and $\dot{q}$, $L$ determines a
solution, a trajectory in $TQ$, i.e. $2n$ functions of time $q(t), \dot{q}(t)$ with the first $n$ functions
determining the latter.

For the Legendre transformation, the fundamental points are that: 

(1): $L$ also determines at any point $q \in Q$, a preferred map $FL_q$ from the tangent
space $T_q$ to its dual space $T^*_q$. Besides, this preferred map:

(2): extends trivially to a preferred map from all of $TQ$ to $T^*Q$; this is the
Legendre transformation, understood geometrically;

(3): extends, under some technical conditions (about certain kinds of uniqueness,
invertibility and smoothness), so as to carry geometric objects of various sorts
defined on $TQ$ to corresponding objects defined on $T^*Q$, and vice versa.

So under these conditions, the Legendre transformation (together with its inverse) 
transfers the entire description of the system's motion between the Lagrangian and 
Hamiltonian frameworks. 



I will explain (1) and (2), but just gesture at (3). 

(1): Intuitively, the preferred map $FL_q$ from each tangent space $T_q$ to its dual space
$T^*_q$ is the transition $\dot{q} \mapsto p$. More precisely: since $L$ is a scalar on $TQ$, any choice of
local coordinates $q$ on a patch of $Q$, together with the induced local coordinates $q, \dot{q}$
on a patch of $TQ$, defines the partial derivatives $\frac{\partial L}{\partial \dot{q}}$. At any point $q$ in the domain of
the local coordinates, this defines a preferred map $FL_q$ from the tangent space $T_q$ to
the dual space $T^*_q$; $FL_q : T_q \to T^*_q$. Namely, a vector $\tau \in T_q$ with components $\dot{q}^i$ in
the coordinate system $q^i$ on $Q$, i.e. $\tau = \dot{q}^i \frac{\partial}{\partial q^i}$ (think of a motion through configuration
$q$ with generalized velocity $\tau$) is mapped to the 1-form whose components in the dual
basis $dq^i$ are $\frac{\partial L}{\partial \dot{q}^i}$. That is

$$ FL_q : \tau = \dot{q}^i \frac{\partial}{\partial q^i} \in T_q \;\mapsto\; \frac{\partial L}{\partial \dot{q}^i}\, dq^i \in T^*_q . \qquad (6.33) $$

One easily checks that because the canonical momenta are a 1-form, this definition is,
despite appearances, coordinate-independent.

(2): An equivalent definition, manifestly coordinate-independent and given for all
$q \in Q$, is as follows. Given $L : TQ \to \mathbb{R}$, define $FL : TQ \to T^*Q$, the fibre derivative,
by

$$ \forall q \in Q, \ \forall \sigma, \tau \in T_q : \quad FL(\sigma) \cdot \tau = \frac{d}{ds}\Big|_{s=0} L(\sigma + s\tau) \qquad (6.34) $$

(We here take $\sigma, \tau$ to encode the identity of the base-point $q$, so that we make notation
simpler, writing $FL(\sigma)$ rather than $FL((q, \sigma))$ etc.) That is: $FL(\sigma) \cdot \tau$ is the derivative
of $L$ at $\sigma$, along the fibre $T_q$ of the fibre bundle $TQ$, in the direction $\tau$. So $FL$ is
fibre-preserving: i.e. it maps the fibre $T_q$ of $TQ$ to the fibre $T^*_q$ of $T^*Q$. In local coordinates
$q, \dot{q}$ on $TQ$, $FL$ is given by:

$$ FL(q^i, \dot{q}^i) = \Big(q^i, \frac{\partial L}{\partial \dot{q}^i}\Big) ; \quad \text{i.e.} \quad p_i = \frac{\partial L}{\partial \dot{q}^i} . \qquad (6.35) $$

An important special case involves a free system (i.e. no potential term in the
Lagrangian) and a configuration manifold $Q$ with a metric $g = g_{ij}$ defined by the kinetic
energy. (Cf. footnote 4 for the definition of this metric: in short, the constraints being
scleronomous (i.e. time-independent, cf. Section 2.1) implies that for any coordinate
system on $Q$, the kinetic energy is a homogeneous quadratic form in the generalized
velocities.) The Lagrangian is then just the kinetic energy of the metric,

$$ L(q, \dot{q}) = L(\dot{q}) := \frac{1}{2}\, g_{ij}\,\dot{q}^i \dot{q}^j \qquad (6.36) $$

so that the fibre derivative is given by

$$ FL(\sigma) \cdot \tau = g(\sigma, \tau) = g_{ij}\,\sigma^i \tau^j , \quad \text{i.e.} \quad p_i = g_{ij}\,\dot{q}^j . \qquad (6.37) $$
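
Eqs. 6.34 and 6.37 can be compared numerically. The sketch below is illustrative only: it assumes a free particle of mass `M` in the plane, described in polar coordinates so that the kinetic-energy metric is $g = \mathrm{diag}(m, m r^2)$, and it estimates the derivative in eq. 6.34 by a central difference. The function names are hypothetical.

```python
# Sketch (not from the paper): free particle of mass m in the plane, in polar
# coordinates q = (r, phi). The kinetic-energy metric is g = diag(m, m r^2).
M = 1.5  # illustrative mass (assumption)

def lagrangian(q, v):
    r, _phi = q
    v_r, v_phi = v
    return 0.5 * M * (v_r**2 + r**2 * v_phi**2)

def metric(q, sigma, tau):
    """g(sigma, tau) = g_ij sigma^i tau^j for g = diag(m, m r^2)."""
    r, _phi = q
    return M * sigma[0] * tau[0] + M * r**2 * sigma[1] * tau[1]

def fibre_derivative(L, q, sigma, tau, h=1e-4):
    """Central-difference estimate of d/ds L(q, sigma + s tau) at s = 0."""
    plus = [s + h * t for s, t in zip(sigma, tau)]
    minus = [s - h * t for s, t in zip(sigma, tau)]
    return (L(q, plus) - L(q, minus)) / (2 * h)

def momenta(L, q, sigma):
    """Components p_i = FL(sigma) . e_i along the coordinate basis vectors."""
    n = len(sigma)
    basis = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return [fibre_derivative(L, q, sigma, e) for e in basis]

q = (2.0, 0.3)
sigma = (0.7, -0.4)
tau = (0.2, 1.1)
print(fibre_derivative(lagrangian, q, sigma, tau), metric(q, sigma, tau))
print(momenta(lagrangian, q, sigma))  # compare with (m v_r, m r^2 v_phi)
```

Because this Lagrangian is quadratic in the velocities, the central difference agrees with $g(\sigma, \tau)$ up to rounding error, and the computed components reproduce $p_i = g_{ij}\dot{q}^j$.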

(3): We can use $FL$ to pull-back to $TQ$ the canonical 1-form $\theta = \theta_H$ and symplectic
form $\omega$ from $T^*Q$ (eq. 6.17 and 6.18, with $\omega = -d\theta$, from Section 6.3.2). That is, we
can define

$$ \theta_L := (FL)^*\theta_H \quad \text{and} \quad \omega_L := (FL)^*\omega . \qquad (6.38) $$

Since exterior differentiation $d$ commutes with pull-backs, $\omega_L = -d\theta_L$. Furthermore:

(i): As one would hope, $\theta_L$, so defined, is Lagrangian mechanics' canonical 1-form,
which we already defined in eq. 2.13 (and which played a central role in the Lagrangian
version of Noether's theorem).

(ii): One can show that $\omega_L$ is non-degenerate iff the Hessian condition eq. 2.3 holds.
So under this condition, we can analyse Lagrangian mechanics in terms of symplectic
structure.

Given $L$, we define its energy function $E : TQ \to \mathbb{R}$ by

$$ \forall v \equiv (q, \tau) \in TQ, \quad E(v) := FL(v) \cdot v - L(v) ; \qquad (6.39) $$

or in coordinates

$$ E(q^i, \dot{q}^i) := \frac{\partial L}{\partial \dot{q}^i}\,\dot{q}^i - L(q^i, \dot{q}^i) . \qquad (6.40) $$

If $FL$ is a diffeomorphism, we find that $E \circ (FL)^{-1}$ is, as one would hope, the
Hamiltonian function $H : T^*Q \to \mathbb{R}$ which we already defined in eq. 4.4.
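
For orientation, the simplest case can be worked through explicitly. The following sketch assumes a single particle of mass $m$ on a line with potential $V(q)$, a choice made here for illustration only.

```latex
% Assumptions: Q = \mathbb{R}, L(q, \dot q) = \tfrac12 m \dot q^2 - V(q), with m > 0.
\begin{align*}
FL(q, \dot q) &= \Big(q, \frac{\partial L}{\partial \dot q}\Big) = (q, m\dot q), \qquad
(FL)^{-1}(q, p) = \Big(q, \frac{p}{m}\Big), \\
E(q, \dot q) &= \frac{\partial L}{\partial \dot q}\,\dot q - L
  = m\dot q^2 - \tfrac12 m\dot q^2 + V(q) = \tfrac12 m\dot q^2 + V(q), \\
\big(E \circ (FL)^{-1}\big)(q, p) &= \frac{p^2}{2m} + V(q) = H(q, p) .
\end{align*}
```

Here $FL$ is a diffeomorphism because $\partial^2 L/\partial \dot q^2 = m \neq 0$, which is what the Hessian condition amounts to in this case.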

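And accordingly, if $FL$ is a diffeomorphism, then the derivative of $FL$ carries the
dynamical vector field $D$ in the Lagrangian description, as defined in eq. 2.8, viz.

$$ D = \dot{q}^i \frac{\partial}{\partial q^i} + \ddot{q}^i \frac{\partial}{\partial \dot{q}^i} , \qquad (6.41) $$

to the Hamiltonian dynamical vector field, viz.

$$ X_H = \dot{q}^i \frac{\partial}{\partial q^i} + \dot{p}_i \frac{\partial}{\partial p_i} = \frac{\partial H}{\partial p_i}\frac{\partial}{\partial q^i} - \frac{\partial H}{\partial q^i}\frac{\partial}{\partial p_i} . \qquad (6.42) $$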

More generally, one can show that if $FL$ is a diffeomorphism, there is a bijective
correspondence between the various geometric structures used in the Lagrangian and
Hamiltonian descriptions. For precise statements of this idea, cf. e.g. Abraham and
Marsden (1978: Theorem 3.6.9) and Marsden and Ratiu (1999: Theorem 7.4.3), and
their preceding discussions.



6.8 Glimpsing the more general framework of Poisson manifolds

Recall that Section 5.1 listed several properties of the Poisson bracket, as defined by
eq. 5.3 or 5.6. We end by briefly describing how the postulation of a bracket that
acts on the scalar functions $F : M \to \mathbb{R}$ defined on any manifold $M$, and possesses
four of Section 5.1's listed properties, provides a sufficient framework for mechanics in
Hamiltonian style. The bracket is again called a 'Poisson bracket', and the manifold
$M$ equipped with such a bracket is called a Poisson manifold.



Namely, we require the following four properties. The Poisson bracket is to be
bilinear; antisymmetric; and to obey the Jacobi identity (eq. 5.11) for any real functions
$F, G, H$ on $M$, i.e.

$$ \{\{F, H\}, G\} + \{\{G, F\}, H\} + \{\{H, G\}, F\} = 0 ; \qquad (6.43) $$

and to obey Leibniz' rule for products (eq. 5.9), i.e.

$$ \{F, H \cdot G\} = \{F, H\} \cdot G + H \cdot \{F, G\} . \qquad (6.44) $$

This generalizes Hamiltonian mechanics: in particular, a Poisson manifold need not
be a symplectic manifold. The main idea of the extra generality is that the antisymmetric
bilinear map that gives the geometry of the state space (the analogue of Section
4.3's symplectic form $\omega$) can be degenerate. So this map can "have extra zeroes", as in
eq. 4.37 and 4.38. (This map is induced by the generalized Poisson bracket, via an
analogue of eq. 5.7.) This means that a Poisson manifold can have odd dimension; while
we saw in Section 4.3.3 that any symplectic vector space is even-dimensional, and so,
therefore, is any symplectic manifold (Section 6.3.1 and 6.6).

On the other hand, the generalized framework has strong connections with the
usual one. 20 One main connection is the result that any Poisson manifold $M$ is a
disjoint union of even-dimensional manifolds, on each of which $M$'s degenerate
antisymmetric bilinear form (induced by the generalized Poisson bracket) restricts to be
non-degenerate; so that there is an orthodox Hamiltonian mechanics on each such
'symplectic leaf'. Another main connection is that Section 5.3's "one-liner" version
of Noether's theorem, eq. 5.18, underpins versions of Noether's theorem for the more
general framework.

This generalized framework is important for various reasons; I will just mention 
two. 

(i): For a system whose orthodox Hamiltonian mechanics on a symplectic manifold
(dimension $2n$, say) depends on $s$ real parameters, it is sometimes natural to consider
the corresponding $(2n + s)$-dimensional space. This is often a Poisson manifold; viz.,
one foliated into an $s$-dimensional family of $2n$-dimensional symplectic manifolds. This
scenario occurs even for some very familiar systems, such as the pivoted rigid body
described by Euler's equations.

(ii): Poisson manifolds often arise in the theory of symplectic reduction. For when
you quotient a symplectic manifold by the action of a group (e.g. a group of symmetries
of a Hamiltonian system in the sense of Section 6.5), you often get a Poisson manifold,
rather than a symplectic one. Indeed, the pivoted rigid body is itself an example of
this.
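
The rigid body gives a concrete odd-dimensional Poisson manifold. The sketch below is not taken from this paper: it assumes the standard rigid-body (Lie-Poisson) bracket on $\mathbb{R}^3$, $\{F, G\}(\Pi) = -\Pi \cdot (\nabla F \times \nabla G)$, with $\Pi$ the body angular momentum, and illustrative principal moments of inertia. It checks that $C = \frac{1}{2}|\Pi|^2$ has vanishing bracket with the energy, and that $C$ and $H$ are conserved along a numerically integrated solution of Euler's equations. All names are hypothetical.

```python
# Sketch (not from the paper): the standard rigid-body (Lie-Poisson) bracket on R^3,
#   {F, G}(Pi) = -Pi . (grad F x grad G),
# with Pi the body angular momentum. Gradients are supplied analytically.
INERTIA = (1.0, 2.0, 3.0)  # illustrative principal moments of inertia (assumption)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def bracket(grad_f, grad_g, pi):
    """Rigid-body bracket of two functions given by their gradient maps."""
    return -dot(pi, cross(grad_f(pi), grad_g(pi)))

def energy(pi):
    return 0.5 * sum(p * p / i for p, i in zip(pi, INERTIA))

def grad_energy(pi):          # H = (1/2) sum_i Pi_i^2 / I_i
    return tuple(p / i for p, i in zip(pi, INERTIA))

def casimir(pi):
    return 0.5 * dot(pi, pi)

def grad_casimir(pi):         # C = (1/2) |Pi|^2
    return tuple(pi)

def euler_rhs(pi):
    """dPi_i/dt = {Pi_i, H}, which works out to Pi x Omega (Euler's equations)."""
    basis = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    return tuple(bracket(lambda _p, e=e: e, grad_energy, pi) for e in basis)

def rk4_step(pi, dt):
    def shifted(k, scale):
        return tuple(p + scale * x for p, x in zip(pi, k))
    k1 = euler_rhs(pi)
    k2 = euler_rhs(shifted(k1, dt / 2))
    k3 = euler_rhs(shifted(k2, dt / 2))
    k4 = euler_rhs(shifted(k3, dt))
    return tuple(p + dt / 6 * (a + 2 * b + 2 * c + d)
                 for p, a, b, c, d in zip(pi, k1, k2, k3, k4))

pi0 = (0.6, -1.0, 0.8)
pi = pi0
for _ in range(2000):          # integrate to t = 2 with dt = 1e-3
    pi = rk4_step(pi, 1e-3)

print("{C, H} at pi0:", bracket(grad_casimir, grad_energy, pi0))
print("drift in C:", casimir(pi) - casimir(pi0))
print("drift in H:", energy(pi) - energy(pi0))
```

In this bracket $C$ has vanishing bracket with every function, not just with $H$, which is one way of seeing the degeneracy: the three-dimensional space is foliated by the spheres $|\Pi| = $ constant, matching the case $n = 1$, $s = 1$ of (i).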

But this generalized framework is a large topic, which we cannot go into: as men- 
tioned, Butterfield (2006) is a philosopher's introduction. 

20 Because of these connections, it is natural to still call the more general framework 'Hamiltonian'; 
as is usually done. But of course this is just a verbal matter. 



For now, we end with a historical point. 21 It is a humbling, but also I hope inspiring,
reflection about one of classical mechanics' monumental figures. Namely: a considerable
part of the modern theory of Poisson manifolds, including their uses for the rigid
body and for symplectic reduction, was already contained in Lie (1890)!